Squashed commit of the following:
All commits by Stephen Diehl <stephen.m.diehl@gmail.com>:

commit 41ba8c36a90cc11723b14ce6c45599eabdcfaa53  Sun Jan 18 21:02:57 2015 -0500  type provenance
commit be5eda941bb4c44b4c4af0ddbbd793643938f4ff  Sun Jan 18 20:13:06 2015 -0500  provenance prototype
commit 7aa958b9c279e7571f7c4887f6aa19443e16f6fb  Sun Jan 18 19:35:08 2015 -0500  fix misc typos
commit 52d60b3b2630e50ef0cd6ea5f0fa1f308d92e26d  Sun Jan 18 15:15:58 2015 -0500  license badge
commit 7d34274afe6f05a0002c8f87e5077b6a130b42b4  Sun Jan 18 15:07:28 2015 -0500  fix resolution for llvm cfg graphs
commit 14d9bc836ecc64f8e9acc60bcbd2da02335255b9  Sun Jan 18 13:12:39 2015 -0500  added codegen dsl stub
commit 0f74cdd6f95d0a1fe1cafd73e45cb1407709efd8  Sun Jan 18 13:01:14 2015 -0500  llvm cfg graphs
commit a199d721503985954060e7670c1d2f5e1a65dd11  Sun Jan 18 10:56:54 2015 -0500  source code font
commit c7db0c5d67b73d8633f08be093971877e2d6ede0  Sun Jan 18 09:59:37 2015 -0500  change phrasing around recursion
commit 6903700db482524233262e722df54b1066218250  Sat Jan 17 18:20:06 2015 -0500  contributors.md
commit 14d90a3f2ebf7ddf1229c084fe4a1e9fa13f2e41  Sat Jan 17 17:35:41 2015 -0500  added llvm logo
commit d270df6d94cbf1ef9eddfdd64af5aabc36ebca72  Sat Jan 17 15:50:28 2015 -0500  initial llvm chapter
commit e71b189c057ea9e399e90e47d9d49bb4cf12cda8  Sat Jan 17 12:21:00 2015 -0500  system-f typing rules
commit 2a7d5c7f137cf352eeae64836df634c98118f594  Thu Jan 15 15:21:14 2015 -0500  flesh out system-f
commit 7b3b2f0a2aea5e1102abe093cf5e0559090720aa  Wed Jan 14 22:22:14 2015 -0500  started on extended parser
commit cdeaf1a2658f15346fe1dc665ca09e954cce6c2e  Wed Jan 14 17:25:02 2015 -0500  creative commons license
commit f09d210be253a05fc8ad0827cd72ffa32404e2ba  Wed Jan 14 16:54:10 2015 -0500  higher res images
commit 8555eadfea8843f5683621e6652857e4259fa896  Wed Jan 14 14:48:44 2015 -0500  cover page
commit e5e542e92610f4bb4c5ac726ffa86cd1e07753e3  Tue Jan 13 17:31:01 2015 -0500  initial happy/alex parser
@ -1,14 +1,19 @@
|
||||
<div class="pagetitle">
|
||||
![](img/titles/introduction.png)
|
||||
</div>
|
||||
|
||||
******
|
||||
<p class="halfbreak">
|
||||
</p>
|
||||
|
||||
> *When the limestone of imperative programming is worn away, the granite of
> functional programming will be observed.*
>
> <cite>— Simon Peyton Jones</cite>

<p class="halfbreak">
</p>
|
||||
|
||||
Introduction
|
||||
============
|
||||
@ -151,42 +156,71 @@ refines the space of allowable behavior and degree of expressible programs for
|
||||
the language. Types are the world's most popular formal method for analyzing
|
||||
programs.
|
||||
|
||||
$$
|
||||
\begin{aligned}
|
||||
1 &: \t{Nat} \\
|
||||
(\lambda x . x) &: \forall a. a \to a \\
|
||||
(\lambda x y . x) &: \forall a b. a \to b \to a \\
|
||||
\end{aligned}
|
||||
$$
|
||||
In a language like Python all expressions have the same type at compile time,
and all syntactically valid programs can be evaluated. In the case where the
program is nonsensical, the runtime raises an exception during execution. The
Python interpreter makes no attempt to analyze the given program for soundness
before running it.
|
||||
|
||||
In more sophisticated languages, types and terms commingle, either with
explicit annotations on binders, or even as first class values themselves.
|
||||
```bash
|
||||
>>> True & "false"
|
||||
Traceback (most recent call last):
|
||||
File "<stdin>", line 1, in <module>
|
||||
TypeError: unsupported operand type(s) for &: 'bool' and 'str'
|
||||
```
|
||||
|
||||
$$
|
||||
\t{Pair} \ u \ v = \Lambda X . \lambda x^{U \rightarrow V \rightarrow X} . x u v
|
||||
$$
|
||||
By comparison Haskell will do quite a bit of work to try to ensure that the
|
||||
program is well-defined before running it. The language that we use to
describe and analyze the static semantics of the program is that of *static
types*.
|
||||
|
||||
In all the languages which we will implement, the types present during
compilation are *erased*. Although types are possibly present in the evaluation
semantics, the runtime cannot dispatch on types of values at runtime. Types by
definition only exist at compile-time in the static semantics of the language.
|
||||
```bash
|
||||
Prelude> True && "false"
|
||||
|
||||
<interactive>:2:9:
|
||||
Couldn't match expected type `Bool' with actual type `[Char]'
|
||||
In the second argument of `(&&)', namely `"false"'
|
||||
In the expression: True && "false"
|
||||
In an equation for `it': it = True && "false"
|
||||
```
|
||||
|
||||
Catching minor type mismatch errors is the simplest example of usage, although
such errors occur extremely frequently, as we humans are quite fallible in our
reasoning about even the simplest of program constructions! Although this is
just the tip of the iceberg, the gradual trend over the last 20 years has been
toward more *expressive types* in modern type systems, which are capable of
guaranteeing a large variety of program correctness properties:
|
||||
|
||||
* Preventing resource allocation errors.
|
||||
* Enforcing security access for program logic.
|
||||
* Side effect management.
|
||||
* Preventing buffer overruns.
|
||||
* Ensuring cryptographic properties for network protocols.
|
||||
* Modeling and verifying theorems in mathematics and logic.
|
||||
* Preventing data races and deadlocks in concurrent systems.
|
||||
|
||||
Type systems can never capture all possible behavior of a program, although
more sophisticated type systems are increasingly able to model a large space of
behavior; this is one of the most exciting areas of modern computer science
research. Put most bluntly, **static types let you be dumb** and offload the
checking that you would otherwise have to do in your head to a system that can
do the reasoning for you and work with you to interactively build your program.
|
||||
|
||||
Functional Compilers
|
||||
--------------------
|
||||
|
||||
A *compiler* is a program for turning a high-level representation of ideas in a
|
||||
human readable language into another form. A compiler is typically divided into
|
||||
parts, a *frontend* and a *backend*. These are loose terms but the frontend
|
||||
typically deals with converting the human representation of the code into some
|
||||
canonicalized form while the backend converts the canonicalized form into
|
||||
another form that is suitable for evaluation.
|
||||
|
||||
The high level structure of our functional compiler is described by the
|
||||
following *block diagram*. Each box describes a *phase*, which is a sequence of
transformations composed to transform the input program.
|
||||
|
||||
<p class="center">
|
||||
![](img/pipeline1.png)
|
||||
</p>
|
||||
|
||||
* **Source** - The frontend textual source language.
|
||||
* **Parsing** - Source is parsed into an abstract syntax tree.
|
||||
@ -200,15 +234,13 @@ A *pass* may transform the input program from one form into another or alter the
|
||||
internal state of the compiler context. The high level description of the forms
|
||||
our final compiler will go through is the following sequence:
|
||||
|
||||
<p class="center">
|
||||
![](img/pipeline2.png)
|
||||
</p>
|
||||
|
||||
Internal forms used during compilation are *intermediate representations* and
|
||||
typically any non-trivial language will involve several.
|
||||
|
||||
Lexing
|
||||
------
|
||||
|
||||
The source code is simply the raw sequence of text that specifies the program.
|
||||
Lexing splits the text stream into a sequence of *tokens*. Only the presence of
|
||||
@ -222,22 +254,22 @@ let f x = x + 1
|
||||
For instance the previous program might generate a token stream like the
|
||||
following:
|
||||
|
||||
```haskell
|
||||
[
|
||||
TokenLet,
|
||||
TokenSym "f",
|
||||
TokenSym "x",
|
||||
TokenEq,
|
||||
TokenSym "x",
|
||||
TokenAdd,
|
||||
TokenNum 1
|
||||
]
|
||||
```
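A hand-rolled lexer producing such a token stream might look like the following sketch. This is purely illustrative (later chapters use Alex for this); the `Token` constructors mirror those above.

```haskell
import Data.Char (isAlpha, isDigit)

data Token
  = TokenLet
  | TokenSym String
  | TokenEq
  | TokenAdd
  | TokenNum Int
  deriving (Show, Eq)

lexer :: String -> [Token]
lexer [] = []
lexer (c:cs)
  | c == ' ' || c == '\t' || c == '\n' = lexer cs      -- skip whitespace
  | c == '='                           = TokenEq  : lexer cs
  | c == '+'                           = TokenAdd : lexer cs
lexer cs@(c:_)
  | isDigit c = let (num, rest) = span isDigit cs       -- consume a numeric literal
                in TokenNum (read num) : lexer rest
  | isAlpha c = let (word, rest) = span isAlpha cs      -- consume an identifier or keyword
                in (if word == "let" then TokenLet else TokenSym word) : lexer rest
lexer (c:_) = error ("lexer: unexpected character " ++ show c)
```

Running `lexer "let f x = x + 1"` yields exactly the token list shown above.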
|
||||
|
||||
Parsing
|
||||
-------
|
||||
|
||||
We can then scan the token stream and dispatch on predefined patterns of tokens
called *productions*, recursively building up a datatype for the *abstract
syntax tree* (AST) by traversal of the input stream and generation of the
appropriate syntactic constructs.
|
||||
|
||||
```haskell
|
||||
type Name = String
|
||||
@ -319,6 +351,19 @@ Let "f" []
|
||||
(Lit (LitInt 1))))
|
||||
```
|
||||
|
||||
Transformation
|
||||
--------------
|
||||
|
||||
The typed core representation is often suitable for evaluation, but quite often
different intermediate representations are more amenable to certain
optimizations and make semantic properties of the language explicit. These
kinds of intermediate forms will often attach information about free variables,
allocations, and usage directly onto the AST to make it readily available to
later passes.
|
||||
|
||||
The most important form we will use is called the *Spineless Tagless G-Machine*
(STG), an abstract machine that makes many of the properties of lazy evaluation
explicit directly in the AST.
|
||||
|
||||
Code Generation
|
||||
---------------
|
||||
|
||||
|
@ -1,6 +1,6 @@
|
||||
<div class="pagetitle">
|
||||
![](img/titles/basics.png)
|
||||
|
||||
******
|
||||
</div>
|
||||
|
||||
<!--
|
||||
<blockquote>
|
||||
@ -33,14 +33,12 @@ add :: Integer -> Integer -> Integer
|
||||
add x y = x + y
|
||||
```
|
||||
|
||||
```haskell
|
||||
add (x,y) = x + y
|
||||
```
|
||||
In Haskell all functions are pure; the only thing a function may do is return a
value.
|
||||
|
||||
All functions in Haskell are curried. For example, a function of three
arguments takes up to three arguments; for anything less than three it yields a
partially applied function which, when given additional arguments, yields
another function or the resulting value if saturated.
|
||||
|
||||
```haskell
|
||||
@ -51,7 +49,8 @@ h :: Int -> Int
|
||||
h = g 2 3
|
||||
```
|
||||
|
||||
Haskell supports higher-order functions, functions which take functions as
arguments and yield other functions.
|
||||
|
||||
```haskell
|
||||
compose f g = \x -> f (g x)
|
||||
@ -86,10 +85,7 @@ constructors also generates special set of functions known as *selectors* which
|
||||
extract the values of a specific field from the record.
|
||||
|
||||
```haskell
|
||||
data Prod = Prod { a :: Int , b :: Bool }
|
||||
|
||||
-- a :: Prod -> Int
|
||||
-- b :: Prod -> Bool
|
||||
@ -158,7 +154,7 @@ Tuples are allowed (with compiler support) up to 15 fields in GHC.
|
||||
Pattern matching
|
||||
----------------
|
||||
|
||||
Pattern matching allows us to discriminate on the constructors of a datatype,
|
||||
mapping separate cases to separate code paths.
|
||||
|
||||
```haskell
|
||||
@ -216,12 +212,17 @@ Recursion
|
||||
In Haskell all iteration over data structures is performed by recursion.
Entering a function in Haskell does not create a new stack frame; the logic of
the function is simply entered with the arguments on the stack and yields its
result to a register. In the case where a function returns an invocation of
itself in the *tail position*, the resulting logic is compiled identically to
``while`` loops in other languages, via a ``jmp`` instruction instead of a
``call``.
|
||||
|
||||
```haskell
|
||||
sum :: [Int] -> Int
sum ys = go ys 0
  where
    go (x:xs) i = go xs (i+x)
    go [] i     = i
```
|
||||
|
||||
Functions can be defined to recurse mutually on each other.
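For instance, a standard sketch of mutual recursion over the natural numbers (the names ``isEven``/``isOdd`` are illustrative):

```haskell
-- Two functions defined by mutual recursion: each is defined in
-- terms of the other, bottoming out at zero.
isEven :: Int -> Bool
isEven 0 = True
isEven n = isOdd (n - 1)

isOdd :: Int -> Bool
isOdd 0 = False
isOdd n = isEven (n - 1)
```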
|
||||
@ -421,19 +422,19 @@ all monad instances must satisfy.
|
||||
**Law 1**
|
||||
|
||||
```haskell
|
||||
return a >>= f = f a
|
||||
```
|
||||
|
||||
**Law 2**
|
||||
|
||||
```haskell
|
||||
m >>= return = m
|
||||
```
|
||||
|
||||
**Law 3**
|
||||
|
||||
```haskell
|
||||
(m >>= f) >>= g = m >>= (\x -> f x >>= g)
|
||||
```
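As a quick sanity check, the three laws can be tested concretely at the ``Maybe`` monad. The helper ``half`` below is illustrative, not from the text.

```haskell
-- An illustrative Kleisli arrow for checking the laws at Maybe.
half :: Int -> Maybe Int
half n = if even n then Just (n `div` 2) else Nothing

law1, law2, law3 :: Bool
law1 = (return 3 >>= half) == half 3                        -- left identity
law2 = (Just 4 >>= return) == Just 4                        -- right identity
law3 = ((Just 8 >>= half) >>= half)
       == (Just 8 >>= (\x -> half x >>= half))              -- associativity
```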
|
||||
|
||||
Haskell has a level of syntactic sugar for monads known as do-notation. In this
|
||||
@ -441,9 +442,9 @@ form binds are written sequentially in block form which extract the variable
|
||||
from the binder.
|
||||
|
||||
```haskell
|
||||
do { a <- f ; m } = f >>= \a -> do { m }
|
||||
do { f ; m } = f >> do { m }
|
||||
do { m } = m
|
||||
```
|
||||
|
||||
So for example the following are equivalent.
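A sketch of this equivalence at the ``Maybe`` monad (the values here are illustrative):

```haskell
-- The do-block is sugar for the explicit (>>=) chain below it.
sugared :: Maybe Int
sugared = do
  a <- Just 1
  b <- Just 2
  return (a + b)

desugared :: Maybe Int
desugared = Just 1 >>= \a -> (Just 2 >>= \b -> return (a + b))
```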
|
||||
@ -504,6 +505,16 @@ discards the left while ``<*`` discards the right. For example in a monadic
|
||||
parser combinator library the ``*>`` operator would parse with the first parser
argument but return the result of the second.
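The distinction can be sketched at the ``Maybe`` applicative (illustrative values only):

```haskell
import Control.Applicative ((*>), (<*))

-- (*>) sequences two actions, keeping the result of the right one;
-- (<*) sequences them, keeping the result of the left one.
keepRight :: Maybe Int
keepRight = Just 1 *> Just 2

keepLeft :: Maybe Int
keepLeft = Just 1 <* Just 2
```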
|
||||
|
||||
Monoids
|
||||
-------
|
||||
|
||||
```haskell
|
||||
class Monoid a where
|
||||
mempty :: a
|
||||
mappend :: a -> a -> a
|
||||
mconcat :: [a] -> a
|
||||
```
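For example, lists form a monoid under concatenation (a sketch using the standard list instance):

```haskell
-- For lists: mempty is [] and mappend is (++), so mconcat flattens.
combined :: [Int]
combined = mconcat [[1, 2], [3], [4, 5]]   -- [1,2,3,4,5]

-- mempty is the identity element of mappend.
identityHolds :: Bool
identityHolds = mappend mempty [1, 2, 3] == [1, 2, 3 :: Int]
```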
|
||||
|
||||
Deriving
|
||||
--------
|
||||
|
||||
@ -638,9 +649,7 @@ evalStack m = execWriterT (evalStateT (unStack m) 0)
|
||||
|
||||
As illustrated by the following stack diagram:
|
||||
|
||||
<p class="center">
|
||||
![](img/stack.png)
|
||||
</p>
|
||||
|
||||
Using mtl and ``GeneralizedNewtypeDeriving`` we can produce the same stack but with
|
||||
a simpler forward facing interface to the transformer stack. Under the hood mtl
|
||||
|
@ -1,6 +1,6 @@
|
||||
<div class="pagetitle">
|
||||
![](img/titles/parsing.png)
|
||||
|
||||
******
|
||||
</div>
|
||||
|
||||
<!--
|
||||
> The tools we use have a profound (and devious!) influence on our thinking habits, and, therefore, on our thinking abilities.
|
||||
@ -29,7 +29,7 @@ NanoParsec
|
||||
So now let's build our own toy parser combinator library which we'll call
|
||||
**NanoParsec** just to get the feel of how these things are built.
|
||||
|
||||
~~~~ {.haskell slice="chapter3/parsec.hs" lower=0 upper=7}
|
||||
~~~~
|
||||
|
||||
Structurally a parser is a function which takes an input stream of characters
|
||||
@ -37,7 +37,7 @@ and yields an parse tree by applying the parser logic over sections of the
|
||||
character stream (called *lexemes*) to build up a composite data structure for
|
||||
the AST.
|
||||
|
||||
~~~~ {.haskell slice="chapter3/parsec.hs" lower=8 upper=8}
|
||||
~~~~
|
||||
|
||||
Running the function will result in traversing the stream of characters yielding
|
||||
@ -89,7 +89,7 @@ and concatenates the result. Together these give rise to both the Alternative
|
||||
and MonadPlus class instances which encode the logic for trying multiple parse
|
||||
functions over the same stream and handling failure and rollover.
|
||||
|
||||
The core operator introduced here is the (``<|>``) operator for combining two
|
||||
optional paths of parser logic, switching to the second path if the first fails with
|
||||
the zero value.
|
||||
|
||||
@ -174,10 +174,6 @@ $ runhaskell parsec.hs
|
||||
7
|
||||
```
|
||||
|
||||
See:
|
||||
|
||||
* [Monads for functional programming](http://homepages.inf.ed.ac.uk/wadler/papers/marktoberdorf/baastad.pdf)
|
||||
|
||||
**Generalizing String**
|
||||
|
||||
The limitations of the String type are well-known, but what is particularly nice
|
||||
|
@ -1,12 +1,11 @@
|
||||
<div class="pagetitle">
|
||||
![](img/titles/lambda_calculus.png)
|
||||
</div>
|
||||
|
||||
******
|
||||
|
||||
<!--
|
||||
> *That language is an instrument of human reason, and not merely a medium for
|
||||
> the expression of thought, is a truth generally admitted.*
|
||||
>
|
||||
> <cite>— George Boole</cite>
|
||||
-->
|
||||
|
||||
<p class="halfbreak">
|
||||
</p>
|
||||
@ -19,14 +18,15 @@ composition, function abstraction of a single variable. The **lambda calculus**
|
||||
consists very simply of three terms and all valid recursive combinations
|
||||
thereof:
|
||||
|
||||
The terms are typically referred to in code by the following contractions.
|
||||
|
||||
- **Var** - A variable
|
||||
- **Lam** - A lambda abstraction
|
||||
- **App** - An application
|
||||
|
||||
<p class="center">
|
||||
![](img/lambda.png)
|
||||
</p>
|
||||
|
||||
$$
|
||||
\begin{aligned}
|
||||
e :=\ & x & \trule{Var} \\
|
||||
@ -183,15 +183,8 @@ choose the Haskell convention which denotes lambda by the backslash (``\``) to
|
||||
the body with (``->``), and application by spaces. Named variables are simply
|
||||
alphanumeric sequences of characters.
|
||||
|
||||
* **Logical notation**: $\mathtt{const} = \lambda x y . x$
|
||||
* **Haskell notation**: ``const = \x y -> x``
|
||||
|
||||
In addition other terms like literal numbers or booleans can be added, and these
|
||||
make writing expository examples a little easier. In addition we will add a
|
||||
|
@ -1,12 +1,15 @@
|
||||
<div class="pagetitle">
|
||||
![](img/titles/type_systems.png)
|
||||
</div>
|
||||
|
||||
******
|
||||
<p class="halfbreak">
|
||||
</p>
|
||||
|
||||
> *[A type system is a] tractable syntactic method for proving the
> absence of certain program behaviors by classifying phrases
> according to the kinds of values they compute.*
>
> <cite>— Benjamin Pierce</cite>
|
||||
|
||||
<p class="halfbreak">
|
||||
</p>
|
||||
@ -194,6 +197,11 @@ $$
|
||||
\tau_1 \to \tau_2 \to \tau_3 \to \tau_4 \quad = \quad \tau_1 \to (\tau_2 \to (\tau_3 \to \tau_4))
|
||||
$$
|
||||
|
||||
In all the languages which we will implement the types present during
|
||||
compilation are *erased*. Although types are possibly present in the evaluation
|
||||
semantics, the runtime cannot dispatch on types of values at runtime. Types by
|
||||
definition only exist at compile-time in the static semantics of the language.
|
||||
|
||||
Small-Step Semantics
|
||||
--------------------
|
||||
|
||||
@ -603,18 +611,16 @@ the space of all programs and draw a large line around the universe of discourse
|
||||
programs that we are willing to consider, since these are the only programs that
|
||||
we can prove properties for.
|
||||
|
||||
> *Well-typed programs don't go wrong, but not every program that never goes wrong
> is well-typed. It's easy to exhibit programs that don't go wrong but are
> ill-typed in ... any ... decidable type system. Many such programs are useful,
> which is why dynamically-typed languages like Erlang and Lisp are justly
> popular.*
>
> <cite>— Simon Peyton Jones</cite>
|
||||
|
||||
<!--
|
||||
<p class="center">
|
||||
![](img/abysmal_pain.png)
|
||||
</p>
|
||||
-->
|
||||
|
||||
Power always comes at a price. Using one system you can do more things. In
|
||||
|
@ -1,10 +1,10 @@
|
||||
<div class="pagetitle">
|
||||
![](img/titles/evaluation.png)
|
||||
</div>
|
||||
|
||||
******
|
||||
|
||||
> *Well-typed programs cannot "go wrong".*
>
> <cite>— Robin Milner</cite>
|
||||
|
||||
<p class="halfbreak">
|
||||
</p>
|
||||
@ -22,8 +22,8 @@ lambda expression are necessarily evaluated before a lambda is reduced. A
|
||||
language in which the arguments are not necessarily evaluated before a lambda is
|
||||
reduced is non-strict.
|
||||
|
||||
Alternatively expressed, diverging terms are represented up to equivalence by
|
||||
the *bottom* value, written as $\bot$. A function $f$ is non-strict if:
|
||||
|
||||
$$
|
||||
f \bot \neq \bot
|
||||
@ -81,9 +81,9 @@ how the subexpression ``(2 + 2)`` is evaluated to normal form before being
|
||||
bound.
|
||||
|
||||
```haskell
|
||||
(\x. \y. y x) (2 + 2) \x. x + 1
=> (\y. y 4) \x. x + 1
=> (\x. x + 1) 4
=> 4 + 1
|
||||
=> 5
|
||||
```
|
||||
@ -169,16 +169,15 @@ For example, the same expression we looked at for call-by-value has the same
|
||||
normal form but arrives at it by a different sequence of reductions:
|
||||
|
||||
```haskell
|
||||
(\x. \y. y x) (2 + 2) \x. x + 1
=> (\y. y (2 + 2)) \x. x + 1
=> (\x. x + 1) (2 + 2)
|
||||
=> (2 + 2) + 1
|
||||
=> 4 + 1
|
||||
=> 5
|
||||
```
|
||||
|
||||
Call-by-name is non-strict, although very few languages use this model.
|
||||
|
||||
Call-by-need
|
||||
------------
|
||||
|
@ -1,8 +1,11 @@
|
||||
<div class="pagetitle">
|
||||
![](img/titles/hindley_milner.png)
|
||||
</div>
|
||||
|
||||
******
|
||||
<p class="halfbreak">
|
||||
</p>
|
||||
|
||||
> *There is nothing more practical than a good theory.*
|
||||
>
|
||||
> <cite>— James C. Maxwell</cite>
|
||||
|
||||
@ -20,7 +23,7 @@ program give rise to a set of constraints that when solved always have a unique
|
||||
*principal type*.
|
||||
|
||||
The simplest Hindley Milner type system is defined by a very short set of rules.
|
||||
The first four rules describe the judgements by which we can map each syntactic
|
||||
construct (``Lam``, ``App``, ``Var``, ``Let``) to their expected types. We'll
|
||||
elaborate on these rules shortly.
|
||||
|
||||
@ -100,6 +103,51 @@ let S f g x = f x (g x);
|
||||
As before ``let rec`` expressions will expand out in terms of the fixpoint
|
||||
operator and are just syntactic sugar.
|
||||
|
||||
Polymorphism
|
||||
------------
|
||||
|
||||
We will add an additional construct to our language that will admit a new form
of *polymorphism*. Polymorphism is the property of a term to simultaneously
admit several distinct types for the same function implementation.
|
||||
|
||||
For instance the polymorphic signature for the identity function maps an input
of type $\alpha$ back to a value of the same type:
|
||||
|
||||
$$
|
||||
\begin{aligned}
|
||||
\mathtt{id}\ & ::\ \forall \alpha. \alpha \to \alpha \\
|
||||
\mathtt{id}\ & =\ \lambda x : \alpha.\ x
|
||||
\end{aligned}
|
||||
$$
|
||||
|
||||
Now instead of having to duplicate the functionality for every possible type
|
||||
(i.e. implementing idInt, idBool, ...), our type system admits any
|
||||
instantiation that is subsumed by the polymorphic type signature.
|
||||
|
||||
$$
|
||||
\begin{aligned}
|
||||
& \t{id}_\t{Int} = \t{Int} \to \t{Int} \\
|
||||
& \t{id}_\t{Bool} = \t{Bool} \to \t{Bool} \\
|
||||
\end{aligned}
|
||||
$$
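In Haskell notation this looks like the following sketch (``identity`` is used here to avoid shadowing the Prelude's ``id``):

```haskell
-- One polymorphic definition; the type system admits each instantiation.
identity :: a -> a
identity x = x

idInt :: Int -> Int
idInt = identity      -- instantiated at Int

idBool :: Bool -> Bool
idBool = identity     -- instantiated at Bool
```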
|
||||
|
||||
A rather remarkable fact of universal quantification is that many properties
about inhabitants of a type are guaranteed by construction; these are the
|
||||
so-called *free theorems*. For instance the only (nonpathological)
|
||||
implementation that can inhabit a function of type ``(a, b) -> a`` is an
|
||||
implementation precisely identical to that of ``fst``.
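Concretely (a sketch; ``first`` is an illustrative name chosen to avoid clashing with the Prelude's ``fst``):

```haskell
-- By parametricity, a total function of type (a, b) -> a has no way to
-- inspect or invent values of types a or b, so it must project the first.
first :: (a, b) -> a
first (x, _) = x
```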
|
||||
|
||||
A slightly less trivial example is that of the ``fmap`` function of type
|
||||
``Functor f => (a -> b) -> f a -> f b``. The second functor law states that:
|
||||
|
||||
```haskell
|
||||
forall f g. fmap f . fmap g = fmap (f . g)
|
||||
```
|
||||
|
||||
However it is impossible to write down a (nonpathological) function for ``fmap``
|
||||
that is well-typed and does not have this property. We get the theorem for free!
|
||||
|
||||
Types
|
||||
-----
|
||||
|
||||
@ -121,11 +169,10 @@ typeInt = TCon "Int"
|
||||
typeBool = TCon "Bool"
|
||||
```
|
||||
|
||||
*Type schemes* model polymorphic types; they indicate that the type variables
bound in the quantifier are polymorphic across the enclosed type and can be
instantiated with any type consistent with the signature.
|
||||
|
||||
```haskell
|
||||
data Scheme = Forall [TVar] Type
|
||||
@ -229,8 +276,8 @@ $$
|
||||
\begin{aligned}
|
||||
\FTV{\alpha} &= \{ \alpha \} \\
|
||||
\FTV{\tau_1 \rightarrow \tau_2} &= \FTV{\tau_1} \cup \FTV{\tau_2} \\
|
||||
\FTV{\t{Int}} &= \varnothing \\
|
||||
\FTV{\t{Bool}} &= \varnothing \\
|
||||
\FTV{\forall x. t} &= \FTV{t} - \{ x \} \\
|
||||
\end{aligned}
|
||||
$$
|
||||
@ -342,9 +389,9 @@ s := [n_0 / m_0, n_1 / m_1, ..., n_k / m_k] \\
|
||||
$$
|
||||
|
||||
Two terms are said to be *unifiable* if there exists a unifying substitution set
|
||||
between them. A substitution set is said to be *confluent* if the application of
|
||||
substitutions is independent of the order applied, i.e. if we always arrive at
|
||||
the same normal form regardless of the order of substitution chosen.
|
||||
|
||||
The notation we'll adopt for unification is the following, read as: two types
$\tau, \tau'$ are unifiable by a substitution $s$.
|
||||
@ -1206,7 +1253,7 @@ Finally tab completion for our shell will use the interpreter's typing
|
||||
environment keys to complete on the set of locally defined variables. Repline
|
||||
supports prefix based tab completion where the prefix of the current command
|
||||
will be used to determine what to tab complete. In the case where we start with
|
||||
the command ``:load`` we will instead tab complete on filenames in the current
working directory.
|
||||
|
||||
```haskell
|
||||
|
007_path.md
@ -1,6 +1,6 @@
|
||||
<div class="pagetitle">
|
||||
![](img/titles/protohaskell.png)
|
||||
|
||||
******
|
||||
</div>
|
||||
|
||||
<!--
|
||||
> Functional languages are unnatural to use. [...] The important question is
|
||||
@ -115,6 +115,11 @@ Things we will not implement are:
|
||||
* Defaulting rules
|
||||
* Exceptions
|
||||
* Parallelism
|
||||
* Software Transactional Memory
|
||||
* Foreign Function Interface
|
||||
|
||||
Now if one feels so inclined one could of course implement these features on top
|
||||
of our final language, but they are left as an exercise to the reader!
|
||||
|
||||
This of course begs the question of whether or not our language is "a Haskell".
|
||||
In the strictest sense, it will not be since it doesn't fully conform to either
|
||||
@ -131,16 +136,16 @@ Intermediate Forms
|
||||
|
||||
The passes between each of the phases make up the main *compilation pipeline*.
|
||||
|
||||
<p class="center">
|
||||
|
||||
![](img/proto_pass.png)
|
||||
</p>
|
||||
|
||||
|
||||
For *ProtoHaskell* our pipeline consists of the transitions between four
|
||||
intermediate forms of the program.
|
||||
|
||||
<p class="center">
|
||||
|
||||
![](img/protohaskell.png)
|
||||
</p>
|
||||
|
||||
|
||||
* The **Source**, the textual representation of the program from a file or user
|
||||
input. This is stored in a ``Text`` type.
|
||||
@ -149,8 +154,7 @@ intermediate forms of the program.
|
||||
after type inference.
|
||||
* The **PHOAS**, the type-erased Core is transformed into Haskell expressions
|
||||
by mapping lambda expressions in our language directly into Haskell lambda
|
||||
expressions and then evaluated using the Haskell runtime.
|
||||
|
||||
Pass Rep Haskell Type
|
||||
-------- -------- ---------
|
||||
Compiling module: prelude.fun
3
λ> :type (>>=)
(>>=) :: Monad m => m a -> (a -> m b) -> m b
λ> :set -ddump-rn
λ> :load test.fun
```

Command line conventions will follow Haskell's naming conventions. There

Data Declarations
-----------------

Data declarations are named blocks of various *ConDecl* constructors for each
of the fields or constructors of a user-defined datatype.

```haskell
data qname [var] where
  [tydecl]
```

```haskell
data Unit where
  Unit :: Unit
```

If there is a sequence of where statements, these are also attached directly
to the declaration, and will later be desugared away into local let statements
across the body of the function.

```haskell
qname [pat] = rhs [where decls]
```

```haskell
const x y = x
```

Fixity declarations are exceedingly simple; they store the precedence of the
declaration along with its associativity (left, right, or non-associative)
and the infix symbol.

```haskell
[infixl|infixr|infix] [integer] ops;
```

```haskell
infixl 4 +;
```

Typeclass Declarations
----------------------

Typeclass declarations consist simply of the list of typeclass constraints,
the name of the class, and the type variable (single parameter only). The body
of the class is simply a sequence of scoped ``FunDecl`` declarations with only
the ``matchType`` field.

```haskell
class [context] => classname [var] where
  [body]
```

For example, consider a very simplified ``Num`` class:

```haskell
class Num a where
```

<div class="pagetitle">
![](img/titles/extended_parser.png)
</div>

<p class="halfbreak">
</p>

Extended Parser
===============

Up until now we've been using parser combinators to build our parsers. Parser
combinators construct a top-down parser, formally in the $\mathtt{LL}(k)$
family of parsers. The parser proceeds top-down, with a sequence of $k$
characters used to dispatch on the leftmost production rule. Combined with
backtracking (i.e. the ``try`` combinator) this is simultaneously both an
extremely powerful and simple model to implement, as we saw before with our
simple 100 line parser library.

However, there is a family of grammars that include left-recursion, which
$\mathtt{LL}(k)$ parsers can be inefficient at handling and are often
incapable of parsing. Left-recursive rules are those where the leftmost
symbol of the rule recurses on itself. For example:

$$
\begin{aligned}
e ::=\ e\ \t{op}\ \t{atom}
\end{aligned}
$$

We demonstrated a way before of handling these cases using the ``chainl1``
parser combinator, and while this is possible, it can in many cases make
inefficient use of the parser stack and lead to ambiguities.

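To make the trade-off concrete, here is a sketch of how ``chainl1`` folds a
left-associative chain without ever writing a left-recursive rule, using a
minimal hand-rolled parser type (illustrative only, not our actual parser
library):

```haskell
import Control.Applicative

newtype Parser a = Parser { runParser :: String -> [(a, String)] }

instance Functor Parser where
  fmap f (Parser p) = Parser $ \s -> [ (f a, r) | (a, r) <- p s ]

instance Applicative Parser where
  pure a = Parser $ \s -> [(a, s)]
  Parser pf <*> Parser pa =
    Parser $ \s -> [ (f a, r') | (f, r) <- pf s, (a, r') <- pa r ]

instance Alternative Parser where
  empty = Parser $ const []
  Parser p <|> Parser q = Parser $ \s ->
    case p s of
      [] -> q s
      xs -> xs

-- The combinator analogue of the left-recursive rule  e ::= e op atom :
-- parse one atom, then fold the following (op, atom) pairs left-associatively.
chainl1 :: Parser a -> Parser (a -> a -> a) -> Parser a
chainl1 p op = foldl (\x (f, y) -> f x y) <$> p <*> many ((,) <$> op <*> p)

digit :: Parser Int
digit = Parser $ \s -> case s of
  (c:cs) | c >= '0' && c <= '9' -> [(fromEnum c - fromEnum '0', cs)]
  _                             -> []

sub :: Parser (Int -> Int -> Int)
sub = Parser $ \s -> case s of
  ('-':cs) -> [((-), cs)]
  _        -> []

expr :: Parser Int
expr = chainl1 digit sub
```

Running ``runParser expr "7-2-1"`` yields the left-associated result
``((7 - 2) - 1)`` without the parser ever recursing on its leftmost symbol.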
The other major family of parsers, $\mathtt{LR}$, is not plagued by the same
concerns over left recursion. On the other hand, $\mathtt{LR}$ parsers are
exceedingly more complicated to implement, relying on rather sophisticated
table-driven methods (and, for generalized LR parsing, Tomita's algorithm) to
do the heavy lifting. The tooling around the construction of the *production
rules*, in a form that can be handled by the algorithm, is usually a DSL that
generates the code for the parser. While the tooling is fairly robust, there
is a level of indirection between us and the code that can often be a bit
brittle to extend with custom logic.

The most common form of this toolchain is the Lex/Yacc lexer and parser
generator, which compiles into efficient C parsers for $\mathtt{LR}$
grammars. Haskell's **Happy** and **Alex** are roughly the Haskell
equivalents of these tools.

Toolchain
---------

Our parser logic will be spread across two different modules:

* Lexer.x
* Parser.y

The code in each of these modules is a hybrid of the specific Alex/Happy
grammar syntax and arbitrary Haskell logic that is spliced in. Code delineated
by braces (``{}``) is regular Haskell, while code outside is parser/lexer
logic.

```haskell
-- **Begin Haskell Syntax**
{
{-# OPTIONS_GHC -w #-}

module Lexer (
  Token(..),
  scanTokens
) where

import Syntax
}
-- **End Haskell Syntax**

-- **Begin Alex Syntax**
%wrapper "basic"

$digit = 0-9
$alpha = [a-zA-Z]
$eol   = [\n]
-- **End Alex Syntax**
```

These files will be used during the code generation of the two modules
``Lexer`` and ``Parser``. The toolchain is accessible in several ways: first
via the command-line tools ``alex`` and ``happy``, which will generate the
resulting modules when passed the appropriate input file.

```bash
$ alex Lexer.x   # Generates Lexer.hs
$ happy Parser.y # Generates Parser.hs
```

Or inside of the cabal file using the ``build-tools`` field.

```haskell
Build-depends: base, array
build-tools:   alex, happy
other-modules:
  Parser,
  Lexer
```

So the resulting structure of our interpreter will have the following set of
files:

* **Lexer.hs**
* **Parser.hs**
* Eval.hs
* Main.hs
* Syntax.hs

Alex
----

Our lexer module will export our Token definition and a function for
converting an arbitrary String into a *token stream*, i.e. a list of Tokens.

```haskell
{
module Lexer (
  Token(..),
  scanTokens
) where

import Syntax
}
```

The tokens are simply an enumeration of the unique possible tokens in our
grammar.

```haskell
data Token
  = TokenLet
  | TokenTrue
  | TokenFalse
  | TokenIn
  | TokenLambda
  | TokenNum Int
  | TokenSym String
  | TokenArrow
  | TokenEq
  | TokenAdd
  | TokenSub
  | TokenMul
  | TokenLParen
  | TokenRParen
  | TokenEOF
  deriving (Eq,Show)

scanTokens :: String -> [Token]
scanTokens = alexScanTokens
```

The token definition is a list of rules mapping atomic characters and
alphabetical sequences to constructors for our ``Token`` datatype.

```haskell
%wrapper "basic"

$digit = 0-9
$alpha = [a-zA-Z]
$eol   = [\n]

tokens :-

  -- Whitespace insensitive
  $eol                          ;
  $white+                       ;

  -- Comments
  "#".*                         ;

  -- Syntax
  let                           { \s -> TokenLet }
  True                          { \s -> TokenTrue }
  False                         { \s -> TokenFalse }
  in                            { \s -> TokenIn }
  $digit+                       { \s -> TokenNum (read s) }
  "->"                          { \s -> TokenArrow }
  \=                            { \s -> TokenEq }
  \\                            { \s -> TokenLambda }
  [\+]                          { \s -> TokenAdd }
  [\-]                          { \s -> TokenSub }
  [\*]                          { \s -> TokenMul }
  \(                            { \s -> TokenLParen }
  \)                            { \s -> TokenRParen }
  $alpha [$alpha $digit \_ \']* { \s -> TokenSym s }
```

Happy
-----

We'll parse into a small untyped lambda calculus for our frontend language.

```haskell
module Syntax where

type Name = String

data Expr
  = Lam Name Expr
  | App Expr Expr
  | Var Name
  | Lit Lit
  | Op Binop Expr Expr
  deriving (Eq,Show)

data Lit
  = LInt Int
  | LBool Bool
  deriving (Show, Eq, Ord)

data Binop = Add | Sub | Mul | Eql
  deriving (Eq, Ord, Show)
```

The token constructors are each assigned to a name that will be used in our
production rules.

```haskell
-- Lexer structure
%tokentype { Token }

-- Token Names
%token
    let   { TokenLet }
    true  { TokenTrue }
    false { TokenFalse }
    in    { TokenIn }
    NUM   { TokenNum $$ }
    VAR   { TokenSym $$ }
    '\\'  { TokenLambda }
    '->'  { TokenArrow }
    '='   { TokenEq }
    '+'   { TokenAdd }
    '-'   { TokenSub }
    '*'   { TokenMul }
    '('   { TokenLParen }
    ')'   { TokenRParen }
```

The parser itself will live inside of a custom monad of our choosing. In this
simple case we'll just add error handling with the ``Except`` monad.

```haskell
-- Parser monad
%monad { Except String } { (>>=) } { return }
%error { parseError }
```

And finally our production rules: the toplevel entry point for our parser will
be the ``expr`` rule. Notice how naturally we can write a left-recursive
grammar for our infix operators.

```haskell
-- Entry point
%name expr

-- Operators
%left '+' '-'
%left '*'
%%

Expr : let VAR '=' Expr in Expr { App (Lam $2 $6) $4 }
     | '\\' VAR '->' Expr       { Lam $2 $4 }
     | Form                     { $1 }

Form : Form '+' Form            { Op Add $1 $3 }
     | Form '-' Form            { Op Sub $1 $3 }
     | Form '*' Form            { Op Mul $1 $3 }
     | Fact                     { $1 }

Fact : Fact Atom                { App $1 $2 }
     | Atom                     { $1 }

Atom : '(' Expr ')'             { $2 }
     | NUM                      { Lit (LInt $1) }
     | VAR                      { Var $1 }
     | true                     { Lit (LBool True) }
     | false                    { Lit (LBool False) }
```

Type Provenance
---------------

We will use a technique to track the "flow" of type information through our
typechecker, associating the position information of the inferred types back
to their positions in the source.

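A minimal sketch of the idea (the ``Loc`` and ``Type`` shapes here are
hypothetical, not our final representation): every inferred type carries the
source span that gave rise to it, so a later unification error can point back
at both origins.

```haskell
-- Hypothetical source position attached to inferred types.
data Loc = Loc { locLine :: Int, locCol :: Int }
  deriving (Eq, Show)

-- Each type node remembers where it was inferred from.
data Type
  = TVar Loc String
  | TCon Loc String
  | TArr Loc Type Type
  deriving (Eq, Show)

-- Provenance travels with the type, so error messages can cite the source.
typeLoc :: Type -> Loc
typeLoc (TVar l _)   = l
typeLoc (TCon l _)   = l
typeLoc (TArr l _ _) = l
```
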
Indentation
-----------

Haskell's syntax uses indentation blocks to delineate sections of code. This
use of indentation-sensitive layout to convey the structure of logic is
sometimes called the *offside rule* in parsing literature. At the beginning of
a "laidout" block the first declaration or definition can start in any column,
and the parser marks that indentation level. Every subsequent top-level
declaration must have the same indentation.

```haskell
-- Start of layout ( Column: 0 )
fib :: Int -> Int
fib x = truncate $ ( 1 / sqrt 5 ) * ( phi ^ x - psi ^ x ) -- (Column: > 0)
  -- Start of new layout ( Column: 2 )
  where
    -- Indented block ( Column: > 2 )
    phi = ( 1 + sqrt 5 ) / 2
    psi = ( 1 - sqrt 5 ) / 2
```

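The essence of the offside rule can be sketched without any parser machinery:
remember the column of the line that opens a block, then treat equally
indented lines as new declarations and more-indented lines as continuations.
A standalone sketch (illustrative only, not the Parsec implementation this
chapter builds):

```haskell
import Data.Char (isSpace)

indentOf :: String -> Int
indentOf = length . takeWhile isSpace

-- Split the lines of a laidout block into items: the first line fixes the
-- reference column; a line at exactly that column starts a new item, a line
-- indented past it continues the current item, and a dedent ends the block.
layout :: [String] -> [[String]]
layout []     = []
layout (l:ls) = go [l] ls
  where
    ref = indentOf l
    go acc []     = [reverse acc]
    go acc (x:xs)
      | indentOf x == ref = reverse acc : go [x] xs  -- aligned: new item
      | indentOf x >  ref = go (x : acc) xs          -- indented: continuation
      | otherwise         = [reverse acc]            -- dedent: block ends
```
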
The Parsec monad is itself parameterized over a user state type variable,
which allows us to embed custom parser state inside of our rules. To adapt our
parser to handle indentation-sensitive whitespace we will store the current
indentation column in this state.

```haskell
-- Indentation sensitive Parsec monad.
type IParsec a = Parsec Text ParseState a

data ParseState = ParseState
  { indents :: Column
  } deriving (Show)

initParseState :: ParseState
initParseState = ParseState 0
```

Inside of Parsec the internal position state (``SourcePos``) is updated
during each traversal, and is accessible inside of rule logic via the
``getPosition`` function.

```haskell
data SourcePos = SourcePos SourceName !Line !Column

getPosition :: Monad m => ParsecT s u m SourcePos
```

In terms of this function we can write down a set of logic that will allow us
to query the current column count and then either succeed or fail to match on
a pattern based on the current indentation level. The ``laidout`` combinator
will capture the current indentation state and push it into the ``indents``
field of the parser state.

```haskell
laidout :: Parsec s ParseState a -> Parsec s ParseState a
laidout m = do
  cur <- indents <$> getState
  pos <- sourceColumn <$> getPosition
  modifyState $ \st -> st { indents = pos }
  res <- m
  modifyState $ \st -> st { indents = cur }
  return res
```

We then have specific logic which guards the parser match based on comparing
the current indentation level to the stored indentation level.

```haskell
indentCmp
  :: (Column -> Column -> Bool)
  -> Parsec s ParseState ()
indentCmp cmp = do
  col     <- sourceColumn <$> getPosition
  current <- indents <$> getState
  guard (col `cmp` current)
```

We can then write two combinators in terms of this function which match on
either a positive or an identical indentation difference.

```haskell
indented :: IParsec ()
indented = indentCmp (>) <?> "Block (indented)"

align :: IParsec ()
align = indentCmp (==) <?> "Block (same indentation)"
```

On top of these we write our two combinators for handling block syntax, which
match a sequence of vertically aligned patterns as a list.

```haskell
block, block1 :: Parser a -> Parser [a]
block  p = laidout (many (align >> p))
block1 p = laidout (many1 (align >> p))
```

GHC uses an optional layout rule for several constructs, allowing us to
equivalently delimit indentation-sensitive syntax manually with braces. The
most common case is do-notation. So for example:

```haskell
example = do { a <- m; b }

example = do
  a <- m
  b
```

To support this in Parsec style we implement a ``maybeBraces`` function.

```haskell
maybeBraces :: Parser a -> Parser [a]
maybeBraces p = braces (endBy p semi) <|> block p

maybeBraces1 :: Parser a -> Parser [a]
maybeBraces1 p = braces (endBy1 p semi) <|> block p
```

Error Reporting
---------------

Parsec's default error reporting leaves a bit to be desired, but does in fact
contain most of the information needed to deliver better messages, packed
inside the ``ParseError`` structure.

```haskell
showSyntaxError :: L.Text -> ParseError -> String
showSyntaxError s err = L.unpack $ L.unlines [
      "  ",
      "  " <> lineContents,
      "  " <> ((L.replicate col " ") <> "^"),
      (L.pack $ show err)
    ]
  where
    lineContents = (L.lines s) !! line
    pos  = errorPos err
    line = sourceLine pos - 1
    col  = fromIntegral $ sourceColumn pos - 1
```

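The caret-pointer formatting itself is independent of Parsec and can be
sketched over plain ``String``s (``caret`` is a made-up helper name):

```haskell
-- Point a caret at a 1-indexed (line, column) position in the source,
-- mirroring the layout produced by a showSyntaxError-style pretty printer.
caret :: String -> Int -> Int -> String
caret src line col = unlines
  [ "  " ++ lineContents
  , "  " ++ replicate (col - 1) ' ' ++ "^"
  ]
  where
    lineContents = lines src !! (line - 1)
```
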
Now when we enter an invalid expression the error reporting will point us
directly to the adjacent lexeme that caused the problem, as is common in many
languages.

```bash
λ> \x -> x +

  \x -> x +
           ^
"<interactive>" (line 1, column 11):
unexpected end of input
expecting "(", character, literal string, "[", integer, "if" or identifier
```

Extensible Operators
--------------------

Haskell famously allows the definition of custom infix operators, an
extremely useful language feature, although this poses a bit of a challenge
to parse! There are several ways to do this, and both depend on two
properties of the operators:

* Precedence
* Associativity

The first way, which GHC uses, is to parse all operators as left-associative
and of the same precedence, and then before desugaring go back and "fix" the
parse tree using all the information we collected after finishing parsing.

The second method is a bit of a hack, and involves simply storing the
collected operators inside of the Parsec state monad and then calling
``buildExpressionParser`` on the current state each time we want to parse an
infix operator expression.

```haskell
data FixitySpec = FixitySpec
  { fixityFix  :: Fixity
  , fixityName :: String
  } deriving (Eq, Show)

data Assoc
  = L
  | R
  | N
  deriving (Eq,Ord,Show)

data Fixity
  = Infix Assoc Int
  | Prefix Int
  | Postfix Int
  deriving (Eq,Ord,Show)
```

In our parser:

```haskell
fixityPrec :: FixitySpec -> Int
fixityPrec (FixitySpec (Infix _ n) _) = n
fixityPrec (FixitySpec _ _)           = 0

mkTable ops =
  map (map toParser) $
    groupBy ((==) `on` fixityPrec) $
      reverse $ sortBy (compare `on` fixityPrec) $ ops

toParser (FixitySpec ass tok) = case ass of
  Infix L _ -> infixOp tok (op (Name tok)) Ex.AssocLeft
  Infix R _ -> infixOp tok (op (Name tok)) Ex.AssocRight
  Infix N _ -> infixOp tok (op (Name tok)) Ex.AssocNone
```

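The sort-and-group step can be exercised on its own. A self-contained sketch
(re-declaring the fixity datatypes from above, and returning just the operator
names per precedence row):

```haskell
import Data.Function (on)
import Data.List (groupBy, sortBy)

data Assoc = L | R | N
  deriving (Eq, Ord, Show)

data Fixity = Infix Assoc Int | Prefix Int | Postfix Int
  deriving (Eq, Ord, Show)

data FixitySpec = FixitySpec
  { fixityFix  :: Fixity
  , fixityName :: String
  } deriving (Eq, Show)

fixityPrec :: FixitySpec -> Int
fixityPrec (FixitySpec (Infix _ n) _) = n
fixityPrec (FixitySpec _ _)           = 0

-- Rows of the operator table: highest precedence first, operators of equal
-- precedence grouped into the same row, as buildExpressionParser expects.
tableRows :: [FixitySpec] -> [[String]]
tableRows ops =
  map (map fixityName) $
    groupBy ((==) `on` fixityPrec) $
      reverse $ sortBy (compare `on` fixityPrec) ops
```
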
```haskell
data ParseState = ParseState
  { indents  :: Column
  , fixities :: [FixitySpec]
  } deriving (Show)

initParseState :: ParseState
initParseState = ParseState 0 defaultOps

defaultOps :: [FixitySpec]
defaultOps =
  [ FixitySpec (Infix L 4) ">"
  , FixitySpec (Infix L 4) "<"
  , FixitySpec (Infix L 4) "/="
  , FixitySpec (Infix L 4) "=="

  , FixitySpec (Infix R 5) ":"

  , FixitySpec (Infix L 6) "+"
  , FixitySpec (Infix L 6) "-"

  -- Multiplication binds tighter than addition, matching Haskell's defaults.
  , FixitySpec (Infix L 7) "*"
  , FixitySpec (Infix L 7) "/"
  ]

addOperator :: FixitySpec -> Parsec s ParseState ()
addOperator fixdecl = do
  modifyState $ \st -> st { fixities = fixdecl : (fixities st) }
```

Now when parsing an infix operator declaration we simply perform a state
operation and add the operator.

```haskell
fixityspec :: Parser FixitySpec
fixityspec = do
  fix  <- fixity
  prec <- precedence
  op   <- operator
  semi
  let spec = FixitySpec (fix prec) op
  addOperator spec
  return spec
  where
    fixity = Infix L <$ reserved "infixl"
         <|> Infix R <$ reserved "infixr"
         <|> Infix N <$ reserved "infix"

precedence :: Parser Int
precedence = do
  n <- natural
  if n <= 10
  then return (fromInteger n)
  else empty
  <?> "Invalid operator precedence"

fixitydecl :: Parser Decl
fixitydecl = do
  spec <- fixityspec
  return $ FixityDecl spec
  <?> "operator fixity definition"
```

Full Source
-----------

* [Happy Parser](https://github.com/sdiehl/write-you-a-haskell/tree/master/chapter9/happy)

Resources
---------

The tooling and documentation for Alex and Happy is well-developed as it is
used extensively inside of GHC:

* [Alex User Guide](https://www.haskell.org/alex/doc)
* [Happy User Guide](https://www.haskell.org/happy/doc/html/)
* [A Tool for Generalized LR Parsing In Haskell](http://www.benmedlock.co.uk/Functional_GLR_Parsing.pdf)
* [Haskell Syntax Definition](https://www.haskell.org/onlinereport/haskell2010/haskellch10.html#x17-17500010)

GHC itself uses Alex and Happy for its parser infrastructure. The resulting
parser is rather sophisticated.

* [Lexer.x](https://github.com/ghc/ghc/blob/master/compiler/parser/Lexer.x)
* [Parser.y](https://github.com/ghc/ghc/blob/master/compiler/parser/Parser.y)

\clearpage

<p class="halfbreak">
</p>

Datatypes
=========

Syntax
------

GHC.Generics
------------

```haskell
class Generic a where
  type Rep a :: * -> *
  from :: a -> Rep a x
  to   :: Rep a x -> a
```

Constructor  Models
-----------  ---------------------------------------------------
``V1``       Void: used for datatypes without constructors
``U1``       Unit: used for constructors without arguments
``K1``       Constants, additional parameters.
``:*:``      Products: encode multiple arguments to constructors
``:+:``      Sums: encode choice between constructors
``L1``       Left hand side of a sum.
``R1``       Right hand side of a sum.
``M1``       Meta-information (constructor names, etc.)

```haskell
newtype M1 i c f p = M1 (f p)
newtype K1 i c p   = K1 c
data U1 p          = U1
```

```haskell
data (:*:) a b p = a p :*: b p
data (:+:) a b p = L1 (a p) | R1 (b p)
```

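As a quick sanity check of the encoding, deriving ``Generic`` for a
two-constructor type shows the sum structure directly: the first constructor
lands in the ``L1`` branch of ``:+:``. A sketch assuming the
``DeriveGeneric`` extension:

```haskell
{-# LANGUAGE DeriveGeneric #-}
import GHC.Generics

data Answer = Yes | No
  deriving (Show, Generic)

-- from maps a value into its generic representation; Yes, being the first
-- constructor, appears under the L1 side of the (:+:) sum.
isFirstConstructor :: Answer -> Bool
isFirstConstructor a = case (from a :: Rep Answer ()) of
  M1 (L1 _) -> True
  M1 (R1 _) -> False
```
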
Full Source
-----------

\clearpage

<p class="halfbreak">
</p>

Renamer
=======

Uniplate
--------

Full Source
-----------

\clearpage

<p class="halfbreak">
</p>

Pattern Matching
================

Full Source
-----------

\clearpage

<div class="pagetitle">
![](img/titles/systemf.png)
</div>

<p class="halfbreak">
</p>

> *Conventional programming languages are growing ever more enormous, but not
> stronger. Inherent defects at the most basic level cause them to be both fat
> and weak...*
>
> <cite>— John Backus</cite>

<p class="halfbreak">
</p>

System F
========

The *second-order lambda calculus*, or System F of Girard and Reynolds, is an
extension of the simply typed lambda calculus that adds an additional level of
lambda abstraction and type application.

$$
\begin{aligned}
\tau ::=\ & \tau \to \tau \\
          & \alpha \\
          & \forall \overline \alpha . \tau \\
\end{aligned}
$$

$$
\begin{aligned}
e ::=\ & x \\
       & \lambda x:\tau . e \\
       & \Lambda \alpha . e \\
       & e\ e \\
       & e\ \tau \\
\end{aligned}
$$

$$
\begin{aligned}
\Gamma ::=\ & \varnothing \\
            & \Gamma,\ x : \tau \\
            & \Gamma,\ \alpha \\
\end{aligned}
$$

Examples
--------

In GHC Core's language the identity function is implemented precisely in this
form, with the only exception being that it uses the backslash for both the
value-level and the type-level lambda. The distinction is that a $\Lambda$ is
parameterized by a type variable (indicated by ``@ t``) while a $\lambda$ is
parameterized over a value variable.

```haskell
\ (@ t) (x :: t) -> x
```

So for some examples:

**Identity function**:

In System F notation:

$$
\begin{aligned}
\mathtt{id}\ & ::\ \forall \alpha. \alpha \to \alpha \\
\mathtt{id}\ & =\ \Lambda \alpha. \lambda x : \alpha. x
\end{aligned}
$$

In GHC Core notation:

```haskell
id :: forall a. a -> a
id = \ (@ a) (x :: a) -> x
```

**Compose function**:

In System F notation:

$$
\begin{aligned}
\mathtt{compose}\ & ::\ \forall \alpha \beta \gamma. (\beta \to \gamma) \to (\alpha \to \beta) \to \alpha \to \gamma \\
\mathtt{compose}\ & =\
\Lambda \alpha \beta \gamma.\
\ \lambda \ (f: \beta \to \gamma)
\ (g: \alpha \to \beta)
\ (x: \alpha).
\ f (g\ x)
\end{aligned}
$$

In GHC Core notation:

```haskell
compose :: forall a b c. (b -> c) -> (a -> b) -> a -> c
compose =
  \ (@ a) (@ b) (@ c)
    (f :: b -> c) (g :: a -> b) (x :: a)
  -> f (g x)
```

Datatypes
---------

Previously, when working with our Hindley-Milner type system, we had to "bolt
on" the primitive datatypes to the language itself. In System F primitive
datatypes are actually definable within the language.

$$ \t{Pair} \ u \ v = \Lambda X . \lambda x^{U \rightarrow V \rightarrow X} . x\ u\ v $$

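With the ``RankNTypes`` extension the same Church encoding typechecks
directly in Haskell (``pair``, ``first`` and ``second`` are our own
illustrative names):

```haskell
{-# LANGUAGE RankNTypes #-}

-- Pair U V = ΛX. λx : U -> V -> X. x u v
type Pair u v = forall x. (u -> v -> x) -> x

pair :: u -> v -> Pair u v
pair u v = \k -> k u v

first :: Pair u v -> u
first p = p (\u _ -> u)

second :: Pair u v -> v
second p = p (\_ v -> v)
```
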
Rules
-----

$$
\begin{array}{cl}
\infrule{x:\tau \in \Gamma}{\Gamma \vdash x:\tau} & \trule{T-Var} \\ \\
\infrule{\Gamma, x : \tau_1 \vdash e : \tau_2}{\Gamma \vdash \lambda x . e : \tau_1 \rightarrow \tau_2 } & \trule{T-Lam} \\ \\
\infrule{\Gamma \vdash e_1 : \tau_1 \rightarrow \tau_2 \andalso \Gamma \vdash e_2 : \tau_1}{\Gamma \vdash e_1 e_2 : \tau_2} & \trule{T-App} \\ \\
\infrule{\Gamma, \alpha \vdash e : \tau}{\Gamma \vdash \Lambda \alpha. e : \forall \alpha. \tau } & \trule{T-TAbs} \\ \\
\infrule{\Gamma \vdash e : \forall \alpha. \tau_1}{\Gamma \vdash e\ \tau_2 : [\alpha / \tau_2] \tau_1} & \trule{T-TApp} \\ \\
\end{array}
$$

Prenex Restriction
------------------

System F is strictly more general than our previous Hindley-Milner type
system, in the sense that every Hindley-Milner term can be expressed in
System F. Implicitly, though, we've made the assumption that quantifiers may
only occur at the front of the type signature, in the *prenex position*.
Under this restriction we can't write a function which takes a polymorphic
function as an argument or returns a polymorphic function as a result.

Normally when Haskell's typechecker infers a type signature it places all
quantifiers of type variables at the outermost position, so that no
quantifier appears within the body of the type expression; this is called the
prenex restriction. It rules out an entire class of type signatures that
would otherwise be expressible within System F, but has the benefit of making
inference tractable.

```haskell
-- Allowed
a :: forall a. (a -> a) -> a -> a
a f x = f x

-- Forbidden
a :: (forall a. a -> a) -> b -> b
a f x = f x

-- Forbidden
a :: a -> (forall b. b -> b)
a x = (\x -> x)

-- Forbidden
a :: (forall a. a -> a) -> (forall b. b -> b)
a x = x
```

The concept of *polymorphism rank* falls out of this notion. Simply put, the
level of nesting of the quantifier inside the type signature specifies the
rank of the type.

* Rank 0: ``t``
* Rank 1: ``forall a. a -> t``
* Rank 2: ``(forall a. a -> t) -> t``
* Rank 3: ``((forall a. a -> t) -> t) -> t``

The term *rank-n polymorphism* describes a type system permitting
polymorphism of arbitrary rank, allowing the quantifier to appear anywhere
and exposing the entire expressible space of System F types.

In GHC's implementation, for a lambda-bound or case-bound variable the user
must provide an explicit annotation, or GHC's type inference will assume that
the type has no ``forall``s in it and must have a principal rank-1 type,
which may not exist.

```haskell
{-# LANGUAGE RankNTypes #-}

-- No annotation: cannot unify Bool with Char
example1 f = (f True, f 'c')

-- Type checks!
example2 :: (forall a. a -> a) -> (Bool, Char)
example2 f = (f True, f 'c')
```

The language extension ``-XRankNTypes`` loosens the prenex restriction so
that we may explicitly place quantifiers within the body of the type. The bad
news is that type inference in this relaxed system is undecidable in general,
so we're required to explicitly annotate functions which use ``RankNTypes``,
or they are otherwise inferred as rank-1 and may not typecheck at all.

As noted before, to fully implement the dictionaries for monad typeclasses we
will need at least rank-2 polymorphism, so that the functions specified in
``DMonad m`` can be instantiated at arbitrary types for ``a`` and ``b``.

```haskell
{-# LANGUAGE RankNTypes #-}

data DMonad m = DMonad
  { return :: forall a. a -> m a
  , bind   :: forall a b. m a -> (a -> m b) -> m b
  }

data Maybe a = Nothing | Just a

bindMaybe :: Maybe a -> (a -> Maybe b) -> Maybe b
bindMaybe (Just x) f = f x
bindMaybe Nothing f  = Nothing

returnMaybe :: a -> Maybe a
returnMaybe x = Just x

-- Maybe monad explicit dictionary.
maybe :: DMonad Maybe
maybe = DMonad
  { bind   = bindMaybe
  , return = returnMaybe
  }
```

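Using the dictionary is then ordinary record access. Repeating the
definitions so the snippet stands alone (and hiding the Prelude names the
record fields shadow):

```haskell
{-# LANGUAGE RankNTypes #-}
import Prelude hiding (Maybe (..), maybe, return)

data DMonad m = DMonad
  { return :: forall a. a -> m a
  , bind   :: forall a b. m a -> (a -> m b) -> m b
  }

data Maybe a = Nothing | Just a
  deriving (Eq, Show)

-- Maybe monad as an explicit dictionary.
maybe :: DMonad Maybe
maybe = DMonad
  { return = Just
  , bind   = \m f -> case m of
      Nothing -> Nothing
      Just x  -> f x
  }
```

For example, ``bind maybe (Just 2) (\x -> return maybe (x + 1))`` evaluates
to ``Just 3``.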
Type Variables
--------------

Categories of types:

* Polytypes $\sigma$
* Rho-types $\rho$
* Monotypes $\tau$

Categories of type variables:

* Meta type variables: $\tau_1, \tau_2$
* Bound type variables: $a, b$
* Skolem type variables

Substitution
------------

Erasure
-------

The type erasure:

$$
\begin{aligned}[lcl]
& \t{erase}(x) &=&\ x \\
& \t{erase}(e_1\ e_2) &=&\ \t{erase}(e_1)\ \t{erase}(e_2) \\
& \t{erase}(\lambda x:\tau.e) &=&\ \lambda x . \t{erase}(e) \\
& \t{erase}(\Lambda \alpha . e) &=&\ \t{erase}(e) \\
& \t{erase}(e\ \tau) &=&\ \t{erase}(e) \\
\end{aligned}
$$

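The erasure function transliterates directly into Haskell over a small System
F term type (the datatypes here are our own illustrative definitions):

```haskell
type Name = String

data Type = TVar Name | TArr Type Type
  deriving (Eq, Show)

-- System F terms.
data Expr
  = Var Name
  | App Expr Expr
  | Lam Name Type Expr   -- λ x : τ . e
  | TLam Name Expr       -- Λ α . e
  | TApp Expr Type       -- e τ
  deriving (Eq, Show)

-- Untyped core terms.
data Core
  = CVar Name
  | CApp Core Core
  | CLam Name Core
  deriving (Eq, Show)

-- Erasure drops type annotations, type abstractions and type applications.
erase :: Expr -> Core
erase (Var x)     = CVar x
erase (App e1 e2) = CApp (erase e1) (erase e2)
erase (Lam x _ e) = CLam x (erase e)
erase (TLam _ e)  = erase e
erase (TApp e _)  = erase e
```
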
Evaluation
----------

unbound-generics
----------------

Up until now we've been writing our own binding implementation. There is,
however, a better way.

\clearpage

<div class="pagetitle">
![](img/titles/llvm.png)
</div>

<p class="halfbreak">
</p>

<!--
> *Conventional programming languages are growing ever more enormous, but not
> stronger. Inherent defects at the most basic level cause them to be both fat
> and weak...*
>
> <cite>— John Backus</cite>
-->

<p class="halfbreak">
</p>

LLVM
====

LLVM is a statically typed intermediate representation and an associated
toolchain for manipulating, optimizing and converting this intermediate form
into native code.

So for example, consider a simple function which takes two arguments, adds
them, and xors the result. Written in IR it would be formed as such:

~~~~ {.haskell include="chapter27/example.ll"}
~~~~

Running this through the LLVM toolchain we can target our high level IR into
|
||||
multiple different assembly codes mapping onto various architectures and CPUs
|
||||
all from the same platform agnostic intermediate representation.
|
||||
|
||||
**x86-64**
|
||||
|
||||
```perl
|
||||
test1:
|
||||
.cfi_startproc
|
||||
andl %edx, %esi
|
||||
andl %edx, %edi
|
||||
xorl %esi, %edi
|
||||
movl %edi, %eax
|
||||
ret
|
||||
```
|
||||
|
||||
**ARM**
|
||||
|
||||
```perl
|
||||
test1:
|
||||
and r1, r2, r1
|
||||
and r0, r2, r0
|
||||
eor r0, r0, r1
|
||||
mov pc, lr
|
||||
```
|
||||
|
||||
**PowerPC**
|
||||
|
||||
```perl
|
||||
.L.test1:
|
||||
.cfi_startproc
|
||||
and 4, 5, 4
|
||||
and 3, 5, 3
|
||||
xor 3, 3, 4
|
||||
blr
|
||||
.long 0
|
||||
.quad 0
|
||||
```
|
||||
|
||||
An uncommonly large number of hardware manufacturers and software vendors (Adobe,
|
||||
AMD, Apple, ARM, Google, IBM, Intel, Mozilla, Qualcomm, Samsung, Xilinx) have
|
||||
converged on the LLVM toolchain as a vendor-agnostic way to talk about
|
||||
generating machine code.
|
||||
|
||||
What's even more impressive is that many of the advances in compiler
|
||||
optimizations and static analysis have been mechanized in the form of
|
||||
optimization passes so that all compilers written on top of the LLVM platform
|
||||
can take advantage of the same advanced optimizers that would often previously
|
||||
have to be developed independently.
|
||||
|
||||
Types
|
||||
-----
|
||||
|
||||
**Primitive**
|
||||
|
||||
```perl
|
||||
i1 ; Boolean type
|
||||
i8 ; char
|
||||
i32 ; 32 bit integer
|
||||
i64 ; 64 bit integer
|
||||
float             ; 32 bit floating point
|
||||
double            ; 64 bit floating point
|
||||
```
|
||||
|
||||
**Arrays**
|
||||
|
||||
```perl
|
||||
[10 x float] ; Array of 10 floats
|
||||
[10 x [20 x i32]] ; Array of 10 arrays of 20 integers.
|
||||
```
|
||||
|
||||
**Structs**
|
||||
|
||||
```perl
|
||||
{float, i64} ; structure
|
||||
{float, {double, i3}} ; nested structure
|
||||
<{float, [2 x i3]}> ; packed structure
|
||||
```
|
||||
|
||||
**Vectors**
|
||||
|
||||
```perl
|
||||
<4 x double>
|
||||
<8 x float>
|
||||
```
|
||||
|
||||
**Pointers**
|
||||
|
||||
```perl
|
||||
float* ; Pointer to a float
|
||||
[25 x float]* ; Pointer to an array
|
||||
```
|
||||
|
||||
The traditional ``void*`` pointer in C is a ``i8*`` pointer in LLVM with the
|
||||
appropriate casts.
|
||||
|
||||
**Constants**
|
||||
|
||||
```perl
|
||||
[i1 true, i1 false] ; constant bool array
|
||||
<i32 42, i32 10> ; constant vector
|
||||
float 1.23421e+2 ; floating point constant
|
||||
null ; null pointer constant
|
||||
```
|
||||
|
||||
The ``zeroinitializer`` constant can be used to instantiate any type to the appropriate
|
||||
zero value for that type.
|
||||
|
||||
```perl
|
||||
<8 x float> zeroinitializer ; Zero vector
|
||||
```
|
||||
|
||||
**Named Types**
|
||||
|
||||
```perl
|
||||
%vec4 = type <4 x i32>
|
||||
%pair = type { i32, i32 }
|
||||
```
|
||||
|
||||
Recursive type declarations are supported.
|
||||
|
||||
```perl
|
||||
%f = type { %f*, i32 }
|
||||
```
|
||||
|
||||
**Platform Information**
|
||||
|
||||
|
||||
```perl
|
||||
target datalayout = "
|
||||
e-
|
||||
p : 64 : 64 : 64-
|
||||
i1 : 8 : 8-
|
||||
i8 : 8 : 8-
|
||||
i16 : 16 : 16-
|
||||
i32 : 32 : 32-
|
||||
i64 : 64 : 64-
|
||||
f32 : 32 : 32-
|
||||
f64 : 64 : 64-
|
||||
v64 : 64 : 64-
|
||||
v128 : 128 : 128-
|
||||
a0 : 0 : 64-
|
||||
s0 : 64 : 64-
|
||||
f80 : 128 : 128-
|
||||
n8 : 16 : 32 : 64-
|
||||
S128
|
||||
"
|
||||
target triple = "x86_64-unknown-linux-gnu"
|
||||
```
|
||||
|
||||
Specifications are delimited by the minus sign ``-``.
|
||||
|
||||
* The ``e`` indicates the platform is little-endian.
* The ``i<n>`` indicates the bitsize and alignment of the integer type.
* The ``f<n>`` indicates the bitsize and alignment of the floating point type.
* The ``p<n>`` indicates the bitsize and alignment of the pointer type.
* The ``v<n>`` indicates the bitsize and alignment of the vector type.
* The ``a<n>`` indicates the bitsize and alignment of the aggregate type.
* The ``n<n>`` indicates the native widths of the CPU registers.
* The ``S<n>`` indicates the alignment of the stack.
|
||||
|
||||
Variables
|
||||
---------
|
||||
|
||||
Symbols used in an LLVM module are either global or local. Global symbols begin
|
||||
with ``@`` and local symbols begin with ``%``. All symbols must be defined or
|
||||
forward declared.
|
||||
|
||||
Instructions in LLVM are either numbered sequentially (``%0``, ``%1``, ...) or
|
||||
given explicit variable names (``%a``, ``%foo``, ...). For example, the arguments
|
||||
to the following function are named values, while the result of the add
|
||||
instruction is unnamed.
|
||||
|
||||
```perl
|
||||
define i32 @add(i32 %a, i32 %b) {
|
||||
%1 = add i32 %a, %b
|
||||
ret i32 %1
|
||||
}
|
||||
```
|
||||
|
||||
Instructions
|
||||
------------
|
||||
|
||||
```perl
|
||||
%result = add i32 10, 20
|
||||
```
|
||||
|
||||
**Logical**
|
||||
|
||||
* ``shl``
|
||||
* ``lshr``
|
||||
* ``ashr``
|
||||
* ``and``
|
||||
* ``or``
|
||||
* ``xor``
|
||||
|
||||
**Binary Operators**
|
||||
|
||||
* ``add``
|
||||
* ``fadd``
|
||||
* ``sub``
|
||||
* ``fsub``
|
||||
* ``mul``
|
||||
* ``fmul``
|
||||
* ``udiv``
|
||||
* ``sdiv``
|
||||
* ``fdiv``
|
||||
* ``urem``
|
||||
* ``srem``
|
||||
* ``frem``
|
||||
|
||||
**Comparison**
|
||||
|
||||
op unsigned signed floating
|
||||
-- --------- -------- --------
|
||||
lt ULT SLT OLT
|
||||
gt UGT SGT OGT
|
||||
le ULE SLE OLE
|
||||
ge UGE SGE OGE
|
||||
eq EQ EQ OEQ
|
||||
ne NE NE ONE
|
||||
|
||||
```perl
|
||||
%c = udiv i32 %a, %b
|
||||
%d = sdiv i32 %a, %b
|
||||
%e = fmul float %a, %b
|
||||
%f = fdiv float %a, %b
|
||||
```
|
||||
|
||||
```perl
|
||||
%g = icmp eq i32 %a, %b
|
||||
%i = icmp slt i32 %a, %b
|
||||
%j = icmp ult i32 %a, %b
|
||||
%k = fcmp olt float %a, %b
|
||||
```
|
||||
|
||||
Data
|
||||
----
|
||||
|
||||
```perl
|
||||
i1 1
|
||||
i32 299792458
|
||||
float 7.29735257e-3
|
||||
double 6.62606957e-34
|
||||
```
|
||||
|
||||
Blocks
|
||||
------
|
||||
|
||||
Function definitions in LLVM introduce a sequence of labeled *basic blocks*
|
||||
containing any number of instructions and a final *terminator* instruction which
|
||||
indicates how control flow yields after the instructions of the basic block are
|
||||
evaluated.
|
||||
|
||||
```perl
|
||||
define i1 @foo() {
|
||||
entry:
|
||||
br label %next
|
||||
next:
|
||||
br label %return
|
||||
return:
|
||||
ret i1 0
|
||||
}
|
||||
```
|
||||
|
||||
A basic block has either zero (for the entry block) or a fixed number of
|
||||
*predecessors*. A graph with basic blocks as nodes and the predecessors of each
|
||||
basic block as edges constitutes a *control flow graph*. LLVM's opt command can
|
||||
be used to dump this graph using graphviz.
|
||||
|
||||
```bash
|
||||
$ opt -view-cfg module.ll
|
||||
$ dot -Tpng module.dot -o module.png
|
||||
```
|
||||
|
||||
We say a basic block A *dominates* a different block B in the control flow
|
||||
graph if it is impossible to reach B without passing through A; equivalently,
|
||||
A is the *dominator* of B.
|
||||
|
||||
All logic in LLVM is written in *static single assignment* (SSA) form. Each
|
||||
variable is assigned precisely once, and every variable is defined before it is
|
||||
used. Updating any existing variable reference creates a new reference for
|
||||
the resulting output.
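For example, the C fragment ``x = 1; x = x + 1;`` cannot reuse the name ``%x`` in LLVM; each "update" introduces a fresh binding (the names here are illustrative):

```perl
%x.0 = add i32 0, 1         ; x = 1
%x.1 = add i32 %x.0, 1      ; x = x + 1, as a new binding
```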
|
||||
|
||||
Control Flow
|
||||
------------
|
||||
|
||||
* Unconditional Branch
|
||||
* Conditional Branch
|
||||
* Switch
|
||||
* Return
|
||||
* Phi
|
||||
|
||||
\clearpage
|
||||
|
||||
**Return**
|
||||
|
||||
The ``ret`` instruction simply exits the current function, yielding the current
|
||||
value to the virtual stack.
|
||||
|
||||
```perl
|
||||
define i1 @foo() {
|
||||
ret i1 0
|
||||
}
|
||||
```
|
||||
|
||||
![](chapter27/cfg/ret.png)
|
||||
|
||||
\clearpage
|
||||
|
||||
**Unconditional Branch**
|
||||
|
||||
The unconditional branch ``br`` simply jumps to any basic block local to the
|
||||
function.
|
||||
|
||||
```perl
|
||||
define i1 @foo() {
|
||||
entry:
|
||||
br label %next
|
||||
next:
|
||||
br label %return
|
||||
return:
|
||||
ret i1 0
|
||||
}
|
||||
```
|
||||
|
||||
![](chapter27/cfg/branch.png)
|
||||
|
||||
\clearpage
|
||||
|
||||
**Conditional Branch**
|
||||
|
||||
The conditional branch ``br`` jumps to one of two basic blocks based on whether
|
||||
a test condition is ``true`` or ``false``. This corresponds to the logic of a
|
||||
traditional "if statement".
|
||||
|
||||
```perl
|
||||
define i32 @foo() {
|
||||
start:
|
||||
br i1 true, label %left, label %right
|
||||
left:
|
||||
ret i32 10
|
||||
right:
|
||||
ret i32 20
|
||||
}
|
||||
```
|
||||
|
||||
![](chapter27/cfg/cbranch.png)
|
||||
|
||||
\clearpage
|
||||
|
||||
**Switch**
|
||||
|
||||
The ``switch`` instruction jumps to one of several branches based on the
|
||||
equality of a value against a jump table mapping values to basic blocks.
|
||||
|
||||
```perl
|
||||
define i32 @foo(i32 %a) {
|
||||
entry:
|
||||
switch i32 %a, label %default [
|
||||
i32 0, label %f
|
||||
i32 1, label %g
|
||||
i32 2, label %h
|
||||
]
|
||||
f:
|
||||
ret i32 1
|
||||
g:
|
||||
ret i32 2
|
||||
h:
|
||||
ret i32 3
|
||||
default:
|
||||
ret i32 0
|
||||
}
|
||||
```
|
||||
|
||||
![](chapter27/cfg/switch.png)
|
||||
|
||||
\clearpage
|
||||
|
||||
**Phi**
|
||||
|
||||
A ``phi`` node selects a value based on the predecessor of the current block.
|
||||
|
||||
```perl
|
||||
define i32 @foo() {
|
||||
start:
|
||||
br i1 true, label %left, label %right
|
||||
left:
|
||||
%plusOne = add i32 0, 1
|
||||
br label %merge
|
||||
right:
|
||||
br label %merge
|
||||
merge:
|
||||
%join = phi i32 [ %plusOne, %left ], [ -1, %right ]
|
||||
ret i32 %join
|
||||
}
|
||||
```
|
||||
|
||||
![](chapter27/cfg/phi.png)
|
||||
|
||||
\clearpage
|
||||
|
||||
**Loops**
|
||||
|
||||
The traditional ``while`` and ``for`` loops can be written in terms of the
|
||||
simpler conditional branching constructs. For example, in C we would write:
|
||||
|
||||
```cpp
|
||||
int count(int n)
|
||||
{
|
||||
int i = 0;
|
||||
while(i < n)
|
||||
{
|
||||
i++;
|
||||
}
|
||||
return i;
|
||||
}
|
||||
```
|
||||
|
||||
Whereas in LLVM we write:
|
||||
|
||||
```perl
|
||||
define i32 @count(i32 %n) {
|
||||
entry:
|
||||
br label %loop
|
||||
|
||||
loop:
|
||||
%i = phi i32 [ 1, %entry ], [ %nextvar, %loop ]
|
||||
%nextvar = add i32 %i, 1
|
||||
|
||||
%cmptmp = icmp ult i32 %i, %n
|
||||
%booltmp = zext i1 %cmptmp to i32
|
||||
%loopcond = icmp ne i32 %booltmp, 0
|
||||
|
||||
br i1 %loopcond, label %loop, label %afterloop
|
||||
|
||||
afterloop:
|
||||
ret i32 %i
|
||||
}
|
||||
```
|
||||
|
||||
![](chapter27/cfg/for.png)
|
||||
|
||||
\clearpage
|
||||
|
||||
**Select**
|
||||
|
||||
The ``select`` instruction yields the first value if the test value is true, and the second if false.
|
||||
|
||||
```perl
|
||||
%x = select i1 true, i8 10, i8 20 ; gives 10
|
||||
%y = select i1 false, i8 10, i8 20 ; gives 20
|
||||
```
|
||||
|
||||
Calls
|
||||
-----
|
||||
|
||||
* ``ccc``: The C calling convention
|
||||
* ``fastcc``: The fast calling convention
|
||||
|
||||
```perl
|
||||
%result = call i32 @exp(i32 7)
|
||||
```
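The calling convention is specified on both the definition and the call site, and the two must agree. A sketch with a hypothetical function ``@plus1``:

```perl
define fastcc i32 @plus1(i32 %x) {
entry:
  %r = add i32 %x, 1
  ret i32 %r
}

define i32 @caller() {
entry:
  %r = call fastcc i32 @plus1(i32 41)
  ret i32 %r
}
```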
|
||||
|
||||
Memory
|
||||
------
|
||||
|
||||
LLVM uses the traditional load/store model:
|
||||
|
||||
* ``load``: Load a typed value from a given reference
|
||||
* ``store``: Store a typed value in a given reference
|
||||
* ``alloca``: Allocate memory on the virtual stack, yielding a pointer to it
|
||||
|
||||
```perl
|
||||
%ptr = alloca i32
|
||||
store i32 3, i32* %ptr
|
||||
%val = load i32* %ptr
|
||||
```
|
||||
|
||||
Specific pointer alignment can be specified:
|
||||
|
||||
```perl
|
||||
%ptr = alloca i32, align 1024
|
||||
```
|
||||
|
||||
For allocating in main memory we use an external reference to the C stdlib
|
||||
memory allocator, which gives us back an ``i8*``.
|
||||
|
||||
```perl
|
||||
%ptr = call i8* @malloc(i32 %objectsize)
|
||||
```
|
||||
|
||||
For structures:
|
||||
|
||||
```perl
|
||||
extractvalue {i32, float} %a, 0 ; gives i32
|
||||
extractvalue {i32, {float, double}} %a, 0, 1 ; gives double
|
||||
extractvalue [2 x i32] %a, 0 ; yields i32
|
||||
```
|
||||
|
||||
```perl
|
||||
%x = insertvalue {i32, float} %b, float %val, 1         ; %b with field 1 set to %val
|
||||
%y = insertvalue {i32, float} zeroinitializer, i32 1, 0 ; gives {i32 1, float 0}
|
||||
```
|
||||
|
||||
GetElementPtr
|
||||
-------------
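The ``getelementptr`` instruction computes the address of a subelement of an aggregate; it performs no memory access itself. The first index steps through the pointer, and subsequent indices select struct fields or array elements. A sketch (``%p`` is assumed to be a ``%pair*``):

```perl
%pair = type { i32, float }

; address of the float field of the struct pointed to by %p
%fld = getelementptr %pair* %p, i32 0, i32 1
```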
|
||||
|
||||
Casts
|
||||
-----
|
||||
|
||||
* ``trunc``
|
||||
* ``zext``
|
||||
* ``sext``
|
||||
* ``fptoui``
|
||||
* ``fptosi``
|
||||
* ``uitofp``
|
||||
* ``sitofp``
|
||||
* ``fptrunc``
|
||||
* ``fpext``
|
||||
* ``ptrtoint``
|
||||
* ``inttoptr``
|
||||
* ``bitcast``
|
||||
|
||||
```perl
|
||||
trunc i32 257 to i8 ; yields i8 1
|
||||
zext i32 257 to i64 ; yields i64 257
|
||||
sext i8 -1 to i16            ; yields i16 -1 (all bits set)
|
||||
bitcast <2 x i32> %a to i64 ; yields i64 %a
|
||||
```
|
||||
|
||||
Toolchain
|
||||
---------
|
||||
|
||||
```bash
|
||||
$ llc example.ll -o example.s # compile
|
||||
$ lli example.ll # execute
|
||||
$ opt -S example.bc -o example.ll   # bitcode to textual IR
|
||||
$ opt example.ll -o example.bc # to bitcode
|
||||
$ opt -O3 example.ll -o example.opt.ll -S # run optimizer
|
||||
```
|
||||
|
||||
Individual modules can be linked together.
|
||||
|
||||
```bash
|
||||
$ llvm-link a.ll b.ll -o c.ll -S
|
||||
```
|
||||
|
||||
Link-time optimization:
|
||||
|
||||
```bash
|
||||
$ clang -O4 -emit-llvm a.c -c -o a.bc
|
||||
$ clang -O4 -emit-llvm b.c -c -o b.bc
|
||||
$ llvm-link a.bc b.bc -o all.bc
|
||||
$ opt -std-compile-opts -std-link-opts -O3 all.bc -o optimized.bc
|
||||
```
|
||||
|
||||
The clang project is a C compiler that targets LLVM as its intermediate
|
||||
representation. In the case where we'd like to know how some specific C
|
||||
construct maps into LLVM IR we can ask clang to dump its internal IR using the
|
||||
``-emit-llvm`` flag.
|
||||
|
||||
```cpp
|
||||
# clang -emit-llvm -S add.c -o -
|
||||
int add(int x)
|
||||
{
|
||||
return x+1;
|
||||
}
|
||||
```
|
||||
|
||||
```perl
|
||||
; ModuleID = 'add.c'
|
||||
define i32 @add(i32 %x) nounwind uwtable {
|
||||
entry:
|
||||
%x.addr = alloca i32, align 4
|
||||
store i32 %x, i32* %x.addr, align 4
|
||||
%0 = load i32* %x.addr, align 4
|
||||
%add = add nsw i32 %0, 1
|
||||
ret i32 %add
|
||||
}
|
||||
```
|
||||
|
||||
LLVM uses a C++ API under the hood of all these tools. If you need to
|
||||
work directly with the API it can be useful to be able to expand out the LLVM IR
|
||||
into the equivalent C++ code.
|
||||
|
||||
```bash
|
||||
$ llc example.ll -march=cpp -o -
|
||||
```
|
||||
|
||||
~~~~ {.haskell include="chapter27/example.ll"}
|
||||
~~~~
|
||||
|
||||
```cpp
|
||||
Function* func_test1 = mod->getFunction("test1");
|
||||
if (!func_test1) {
|
||||
func_test1 = Function::Create(
|
||||
/*Type=*/FuncTy_0,
|
||||
/*Linkage=*/GlobalValue::ExternalLinkage,
|
||||
/*Name=*/"test1", mod);
|
||||
func_test1->setCallingConv(CallingConv::C);
|
||||
}
|
||||
AttrListPtr func_test1_PAL;
|
||||
func_test1->setAttributes(func_test1_PAL);
|
||||
|
||||
{
|
||||
Function::arg_iterator args = func_test1->arg_begin();
|
||||
Value* int32_x = args++;
|
||||
int32_x->setName("x");
|
||||
Value* int32_y = args++;
|
||||
int32_y->setName("y");
|
||||
Value* int32_z = args++;
|
||||
int32_z->setName("z");
|
||||
|
||||
BasicBlock* label_1 = BasicBlock::Create(mod->getContext(), "",func_test1,0);
|
||||
|
||||
BinaryOperator* int32_a = BinaryOperator::Create(
|
||||
Instruction::And, int32_z, int32_x, "a", label_1);
|
||||
BinaryOperator* int32_b = BinaryOperator::Create(
|
||||
Instruction::And, int32_z, int32_y, "b", label_1);
|
||||
BinaryOperator* int32_c = BinaryOperator::Create(
|
||||
Instruction::Xor, int32_a, int32_b, "c", label_1);
|
||||
ReturnInst::Create(mod->getContext(), int32_c, label_1);
|
||||
|
||||
}
|
||||
```
|
||||
|
||||
llvm-general
|
||||
------------
|
||||
|
||||
The LLVM bindings for Haskell are split across two packages:
|
||||
|
||||
* **llvm-general-pure** is a pure Haskell representation of the LLVM IR.
|
||||
|
||||
* **llvm-general** provides the FFI bindings to LLVM required for constructing the C representation of the
|
||||
LLVM IR and performing optimization and compilation.
|
||||
|
||||
llvm-general-pure does not require the LLVM libraries be available on the system.
|
||||
|
||||
GHCi can have issues with the FFI, which can lead to errors when working with
|
||||
``llvm-general``. If you end up with errors like the following, then you are
|
||||
likely trying to use ``GHCi`` or ``runhaskell`` and it is unable to link against
|
||||
your LLVM library. Instead compile with standalone ``ghc``.
|
||||
|
||||
```bash
|
||||
Loading package llvm-general-3.3.8.2
|
||||
... linking
|
||||
... ghc: /usr/lib/llvm-3.3/lib/libLLVMSupport.a: unknown symbol `_ZTVN4llvm14error_categoryE'
|
||||
ghc: unable to load package `llvm-general-3.3.8.2'
|
||||
```
|
||||
|
||||
Code Generation (LLVM)
|
||||
======================
|
||||
|
||||
Resources
|
||||
---------
|
||||
|
||||
* [LLVM Language Reference](http://llvm.org/docs/LangRef.html)
|
||||
* [Implementing a JIT Compiled Language with Haskell and LLVM](http://www.stephendiehl.com/llvm/)
|
||||
|
||||
\clearpage
|
14
CONTRIBUTORS.md
Normal file
@ -0,0 +1,14 @@
|
||||
Contributors
|
||||
============
|
||||
|
||||
* Matthew Pickering
|
||||
* Nick Sinopoli
|
||||
* Nicolas Trangez
|
||||
* Ingo Blechschmidt
|
||||
* Rein Henrichs
|
||||
* Ian Connolly
|
||||
* Ben James
|
||||
* Abe Voelker
|
||||
* Paulo Tanimoto
|
||||
* Brandon Williams
|
||||
* Dmitry Ivanov
|
1
Makefile
@ -22,6 +22,7 @@ all: $(OBJ) top
|
||||
$(PANDOC) --filter ${FILTER} -f $(IFORMAT) $(FLAGS) -o $@ $<
|
||||
|
||||
pdf: $(FILTER)
|
||||
# $(PANDOC) --filter ${FILTER} -f $(IFORMAT) --template $(TEMPLATE_TEX) --latex-engine=xelatex $(FLAGS) -o WYAH.pdf title.md 0*.md contributing.md
|
||||
$(PANDOC) --filter ${FILTER} -f $(IFORMAT) --template $(TEMPLATE_TEX) --latex-engine=xelatex $(FLAGS) -o WYAH.pdf title.md 0*.md
|
||||
|
||||
epub: $(FILTER)
|
||||
|
@ -18,6 +18,7 @@
|
||||
|
||||
[![Build Status](https://travis-ci.org/sdiehl/write-you-a-haskell.svg)](https://travis-ci.org/sdiehl/write-you-a-haskell)
|
||||
[![Gitter](https://badges.gitter.im/Join%20Chat.svg)](https://gitter.im/sdiehl/write-you-a-haskell?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=body_badge)
|
||||
[![MIT License](http://img.shields.io/badge/license-mit-blue.svg)](https://github.com/sdiehl/write-you-a-haskell/blob/master/LICENSE)
|
||||
|
||||
Read Online:
|
||||
|
||||
|
0
chapter10/.gitkeep
Normal file
36
chapter10/generics.hs
Normal file
@ -0,0 +1,36 @@
|
||||
{-# LANGUAGE TypeOperators #-}
|
||||
{-# LANGUAGE FlexibleContexts #-}
|
||||
{-# LANGUAGE DefaultSignatures #-}
|
||||
|
||||
import GHC.Generics
|
||||
|
||||
-- Auxiliary class
|
||||
class GEq' f where
|
||||
geq' :: f a -> f a -> Bool
|
||||
|
||||
instance GEq' U1 where
|
||||
geq' _ _ = True
|
||||
|
||||
instance (GEq c) => GEq' (K1 i c) where
|
||||
geq' (K1 a) (K1 b) = geq a b
|
||||
|
||||
instance (GEq' a) => GEq' (M1 i c a) where
|
||||
geq' (M1 a) (M1 b) = geq' a b
|
||||
|
||||
instance (GEq' a, GEq' b) => GEq' (a :+: b) where
|
||||
geq' (L1 a) (L1 b) = geq' a b
|
||||
geq' (R1 a) (R1 b) = geq' a b
|
||||
geq' _ _ = False
|
||||
|
||||
instance (GEq' a, GEq' b) => GEq' (a :*: b) where
|
||||
geq' (a1 :*: b1) (a2 :*: b2) = geq' a1 a2 && geq' b1 b2
|
||||
|
||||
--
|
||||
class GEq a where
|
||||
geq :: a -> a -> Bool
|
||||
default geq :: (Generic a, GEq' (Rep a)) => a -> a -> Bool
|
||||
geq x y = geq' (from x) (from y)
|
||||
|
||||
instance GEq Char where geq = (==)
|
||||
instance GEq Int where geq = (==)
|
||||
instance GEq Float where geq = (==)
|
0
chapter11/.gitkeep
Normal file
0
chapter12/.gitkeep
Normal file
24
chapter12/systemf/Syntax.hs
Normal file
@ -0,0 +1,24 @@
|
||||
type Name = String
|
||||
type TypeVar = String
|
||||
type TypeCon = String
|
||||
|
||||
data Expr
|
||||
= Lam Type Name Expr -- \x -> a
|
||||
| Var Name -- x
|
||||
| App Expr Expr -- a b
|
||||
| TLam Name Expr -- /\ a . b
|
||||
| TApp Expr Type -- a [ b ]
|
||||
| Lit Literal -- 1
|
||||
| Let Name Expr Expr -- let x = v in a
|
||||
|
||||
data Type
|
||||
= TForall [Name] Type
|
||||
| TArr Type Type
|
||||
| TCon TypeCon
|
||||
| TVar TypeVar
|
||||
deriving (Show)
|
||||
|
||||
data Literal
|
||||
= LitInt Integer
|
||||
| LitChar Char
|
||||
deriving (Eq, Ord, Show)
|
9
chapter27/cfg/branch.dot
Normal file
@ -0,0 +1,9 @@
|
||||
digraph "CFG for 'foo' function" {
|
||||
graph [ dpi = 72 ];
|
||||
label="Unconditional Branch";
|
||||
Node0x103c3c0 [shape=record,label="{start:\l br label %next\l}"];
|
||||
Node0x103c3c0 -> Node0x1038a90;
|
||||
Node0x1038a90 [shape=record,label="{next: \l br label %return\l}"];
|
||||
Node0x1038a90 -> Node0x1038b90;
|
||||
Node0x1038b90 [shape=record,label="{return: \l ret i1 false\l}"];
|
||||
}
|
9
chapter27/cfg/branch.ll
Normal file
@ -0,0 +1,9 @@
|
||||
define i1 @foo() {
|
||||
start:
|
||||
br label %next
|
||||
next:
|
||||
br label %return
|
||||
return:
|
||||
ret i1 0
|
||||
}
|
||||
|
BIN
chapter27/cfg/branch.png
Normal file
After Width: | Height: | Size: 6.1 KiB |
10
chapter27/cfg/cbranch.dot
Normal file
@ -0,0 +1,10 @@
|
||||
digraph "CFG for 'foo' function" {
|
||||
graph [ dpi = 72 ];
|
||||
label="Conditional Branch";
|
||||
|
||||
Node0x1f793c0 [shape=record,label="{start:\l br i1 true, label %left, label %right\l|{<s0>T|<s1>F}}"];
|
||||
Node0x1f793c0:s0 -> Node0x1f75a20;
|
||||
Node0x1f793c0:s1 -> Node0x1f75b40;
|
||||
Node0x1f75a20 [shape=record,label="{left: \l ret i32 10\l}"];
|
||||
Node0x1f75b40 [shape=record,label="{right: \l ret i32 20\l}"];
|
||||
}
|
8
chapter27/cfg/cbranch.ll
Normal file
@ -0,0 +1,8 @@
|
||||
define i32 @foo() {
|
||||
start:
|
||||
br i1 true, label %left, label %right
|
||||
left:
|
||||
ret i32 10
|
||||
right:
|
||||
ret i32 20
|
||||
}
|
BIN
chapter27/cfg/cbranch.png
Normal file
After Width: | Height: | Size: 6.7 KiB |
11
chapter27/cfg/for.dot
Normal file
@ -0,0 +1,11 @@
|
||||
digraph "CFG for 'printstar' function" {
|
||||
graph [ dpi = 72 ];
|
||||
label="For loop";
|
||||
|
||||
Node0x11ea390 [shape=record,label="{entry:\l br label %loop\l}"];
|
||||
Node0x11ea390 -> Node0x11e72d0;
|
||||
Node0x11e72d0 [shape=record,label="{loop: \l %i = phi i32 [ 1, %entry ], [ %nextvar, %loop ]\l %nextvar = add i32 %i, 1\l %cmptmp = icmp ult i32 %i, %n\l %booltmp = zext i1 %cmptmp to i32\l %loopcond = icmp ne i32 %booltmp, 0\l br i1 %loopcond, label %loop, label %afterloop\l|{<s0>T|<s1>F}}"];
|
||||
Node0x11e72d0:s0 -> Node0x11e72d0;
|
||||
Node0x11e72d0:s1 -> Node0x11e7540;
|
||||
Node0x11e7540 [shape=record,label="{afterloop: \l ret i32 %i\l}"];
|
||||
}
|
17
chapter27/cfg/for.ll
Normal file
@ -0,0 +1,17 @@
|
||||
define i32 @printstar(i32 %n) {
|
||||
entry:
|
||||
br label %loop
|
||||
|
||||
loop:
|
||||
%i = phi i32 [ 1, %entry ], [ %nextvar, %loop ]
|
||||
%nextvar = add i32 %i, 1
|
||||
|
||||
%cmptmp = icmp ult i32 %i, %n
|
||||
%booltmp = zext i1 %cmptmp to i32
|
||||
%loopcond = icmp ne i32 %booltmp, 0
|
||||
|
||||
br i1 %loopcond, label %loop, label %afterloop
|
||||
|
||||
afterloop:
|
||||
ret i32 %i
|
||||
}
|
BIN
chapter27/cfg/for.png
Normal file
After Width: | Height: | Size: 15 KiB |
14
chapter27/cfg/phi.dot
Normal file
@ -0,0 +1,14 @@
|
||||
digraph "CFG for 'foo' function" {
|
||||
graph [ dpi = 72 ];
|
||||
|
||||
label="Phi node";
|
||||
|
||||
Node0x175c3c0 [shape=record,label="{start:\l br i1 true, label %left, label %right\l|{<s0>T|<s1>F}}"];
|
||||
Node0x175c3c0:s0 -> Node0x1758a20;
|
||||
Node0x175c3c0:s1 -> Node0x1758b40;
|
||||
Node0x1758a20 [shape=record,label="{left: \l %plusOne = add i32 0, 1\l br label %merge\l}"];
|
||||
Node0x1758a20 -> Node0x17593e0;
|
||||
Node0x1758b40 [shape=record,label="{right: \l br label %merge\l}"];
|
||||
Node0x1758b40 -> Node0x17593e0;
|
||||
Node0x17593e0 [shape=record,label="{merge: \l %join = phi i32 [ %plusOne, %left ], [ -1, %right ]\l ret i32 %join\l}"];
|
||||
}
|
12
chapter27/cfg/phi.ll
Normal file
@ -0,0 +1,12 @@
|
||||
define i32 @foo() {
|
||||
start:
|
||||
br i1 true, label %left, label %right
|
||||
left:
|
||||
%plusOne = add i32 0, 1
|
||||
br label %merge
|
||||
right:
|
||||
br label %merge
|
||||
merge:
|
||||
%join = phi i32 [ %plusOne, %left], [ -1, %right]
|
||||
ret i32 %join
|
||||
}
|
BIN
chapter27/cfg/phi.png
Normal file
After Width: | Height: | Size: 15 KiB |
6
chapter27/cfg/ret.dot
Normal file
@ -0,0 +1,6 @@
|
||||
digraph "CFG for 'foo' function" {
|
||||
graph [ dpi = 72 ];
|
||||
|
||||
label="Return";
|
||||
Node0x259a2b0 [shape=record,label="{%0:\l ret i1 false\l}"];
|
||||
}
|
4
chapter27/cfg/ret.ll
Normal file
@ -0,0 +1,4 @@
|
||||
define i1 @foo() {
|
||||
ret i1 0
|
||||
}
|
||||
|
BIN
chapter27/cfg/ret.png
Normal file
After Width: | Height: | Size: 2.1 KiB |
14
chapter27/cfg/switch.dot
Normal file
@ -0,0 +1,14 @@
|
||||
digraph "CFG for 'foo' function" {
|
||||
graph [ dpi = 72 ];
|
||||
label="Switch";
|
||||
|
||||
Node0x1027390 [shape=record,label="{entry:\l switch i32 %a, label %default [\l i32 0, label %f\l i32 1, label %g\l i32 2, label %h\l ]\l|{<s0>def|<s1>0|<s2>1|<s3>2}}"];
|
||||
Node0x1027390:s0 -> Node0x1024700;
|
||||
Node0x1027390:s1 -> Node0x1024300;
|
||||
Node0x1027390:s2 -> Node0x1024440;
|
||||
Node0x1027390:s3 -> Node0x1023920;
|
||||
Node0x1024300 [shape=record,label="{f: \l ret i32 1\l}"];
|
||||
Node0x1024440 [shape=record,label="{g: \l ret i32 2\l}"];
|
||||
Node0x1023920 [shape=record,label="{h: \l ret i32 3\l}"];
|
||||
Node0x1024700 [shape=record,label="{default: \l ret i32 0\l}"];
|
||||
}
|
14
chapter27/cfg/switch.ll
Normal file
@ -0,0 +1,14 @@
|
||||
define i32 @foo(i32 %a) {
|
||||
entry:
|
||||
switch i32 %a, label %default [ i32 0, label %f
|
||||
i32 1, label %g
|
||||
i32 2, label %h ]
|
||||
f:
|
||||
ret i32 1
|
||||
g:
|
||||
ret i32 2
|
||||
h:
|
||||
ret i32 3
|
||||
default:
|
||||
ret i32 0
|
||||
}
|
BIN
chapter27/cfg/switch.png
Normal file
After Width: | Height: | Size: 12 KiB |
281
chapter27/dsl/Codegen.hs
Normal file
@ -0,0 +1,281 @@
|
||||
{-# LANGUAGE OverloadedStrings #-}
|
||||
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
|
||||
|
||||
module Codegen where
|
||||
|
||||
import Data.Word
|
||||
import Data.String
|
||||
import Data.List
|
||||
import Data.Function
|
||||
import qualified Data.Map as Map
|
||||
|
||||
import Control.Monad.State
|
||||
import Control.Applicative
|
||||
|
||||
import LLVM.General.AST
|
||||
import LLVM.General.AST.Global
|
||||
import qualified LLVM.General.AST as AST
|
||||
|
||||
import qualified LLVM.General.AST.Constant as C
|
||||
import qualified LLVM.General.AST.Attribute as A
|
||||
import qualified LLVM.General.AST.CallingConvention as CC
|
||||
import qualified LLVM.General.AST.FloatingPointPredicate as FP
|
||||
|
||||
-------------------------------------------------------------------------------
|
||||
-- Module Level
|
||||
-------------------------------------------------------------------------------
|
||||
|
||||
newtype LLVM a = LLVM { unLLVM :: State AST.Module a }
|
||||
deriving (Functor, Applicative, Monad, MonadState AST.Module )
|
||||
|
||||
runLLVM :: AST.Module -> LLVM a -> AST.Module
|
||||
runLLVM = flip (execState . unLLVM)
|
||||
|
||||
emptyModule :: String -> AST.Module
|
||||
emptyModule label = defaultModule { moduleName = label }
|
||||
|
||||
addDefn :: Definition -> LLVM ()
|
||||
addDefn d = do
|
||||
defs <- gets moduleDefinitions
|
||||
modify $ \s -> s { moduleDefinitions = defs ++ [d] }
|
||||
|
||||
define :: Type -> String -> [(Type, Name)] -> Codegen a -> LLVM ()
|
||||
define retty label argtys body = addDefn $
|
||||
GlobalDefinition $ functionDefaults {
|
||||
name = Name label
|
||||
, parameters = ([Parameter ty nm [] | (ty, nm) <- argtys], False)
|
||||
, returnType = retty
|
||||
, basicBlocks = bls
|
||||
}
|
||||
where
|
||||
bls = createBlocks $ execCodegen $ do
|
||||
enter <- addBlock entryBlockName
|
||||
void $ setBlock enter
|
||||
body
|
||||
|
||||
external :: Type -> String -> [(Type, Name)] -> LLVM ()
|
||||
external retty label argtys = addDefn $
|
||||
GlobalDefinition $ functionDefaults {
|
||||
name = Name label
|
||||
, parameters = ([Parameter ty nm [] | (ty, nm) <- argtys], False)
|
||||
, returnType = retty
|
||||
, basicBlocks = []
|
||||
}
|
||||
|
||||
---------------------------------------------------------------------------------
|
||||
-- Types
|
||||
-------------------------------------------------------------------------------
|
||||
|
||||
-- IEEE 754 double
|
||||
double :: Type
|
||||
double = FloatingPointType 64 IEEE
|
||||
|
||||
-------------------------------------------------------------------------------
|
||||
-- Names
|
||||
-------------------------------------------------------------------------------
|
||||
|
||||
type Names = Map.Map String Int
|
||||
|
||||
uniqueName :: String -> Names -> (String, Names)
|
||||
uniqueName nm ns =
|
||||
case Map.lookup nm ns of
|
||||
Nothing -> (nm, Map.insert nm 1 ns)
|
||||
Just ix -> (nm ++ show ix, Map.insert nm (ix+1) ns)
|
||||
|
||||
instance IsString Name where
|
||||
fromString = Name . fromString
|
||||
|
||||
-------------------------------------------------------------------------------
|
||||
-- Codegen State
|
||||
-------------------------------------------------------------------------------
|
||||
|
||||
type SymbolTable = [(String, Operand)]

data CodegenState
  = CodegenState {
    currentBlock :: Name                     -- Name of the active block to append to
  , blocks       :: Map.Map Name BlockState  -- Blocks for function
  , symtab       :: SymbolTable              -- Function scope symbol table
  , blockCount   :: Int                      -- Count of basic blocks
  , count        :: Word                     -- Count of unnamed instructions
  , names        :: Names                    -- Name Supply
  } deriving Show

data BlockState
  = BlockState {
    idx   :: Int                             -- Block index
  , stack :: [Named Instruction]             -- Stack of instructions
  , term  :: Maybe (Named Terminator)        -- Block terminator
  } deriving Show

-------------------------------------------------------------------------------
-- Codegen Operations
-------------------------------------------------------------------------------

newtype Codegen a = Codegen { runCodegen :: State CodegenState a }
  deriving (Functor, Applicative, Monad, MonadState CodegenState)

sortBlocks :: [(Name, BlockState)] -> [(Name, BlockState)]
sortBlocks = sortBy (compare `on` (idx . snd))

createBlocks :: CodegenState -> [BasicBlock]
createBlocks m = map makeBlock $ sortBlocks $ Map.toList (blocks m)

makeBlock :: (Name, BlockState) -> BasicBlock
makeBlock (l, BlockState _ s t) = BasicBlock l s (maketerm t)
  where
    maketerm (Just x) = x
    maketerm Nothing = error $ "Block has no terminator: " ++ show l

entryBlockName :: String
entryBlockName = "entry"

emptyBlock :: Int -> BlockState
emptyBlock i = BlockState i [] Nothing

emptyCodegen :: CodegenState
emptyCodegen = CodegenState (Name entryBlockName) Map.empty [] 1 0 Map.empty

execCodegen :: Codegen a -> CodegenState
execCodegen m = execState (runCodegen m) emptyCodegen

fresh :: Codegen Word
fresh = do
  i <- gets count
  modify $ \s -> s { count = 1 + i }
  return $ i + 1

instr :: Instruction -> Codegen Operand
instr ins = do
  n <- fresh
  let ref = UnName n
  blk <- current
  let i = stack blk
  modifyBlock (blk { stack = i ++ [ref := ins] })
  return $ local ref

terminator :: Named Terminator -> Codegen (Named Terminator)
terminator trm = do
  blk <- current
  modifyBlock (blk { term = Just trm })
  return trm

-------------------------------------------------------------------------------
-- Block Stack
-------------------------------------------------------------------------------

entry :: Codegen Name
entry = gets currentBlock

addBlock :: String -> Codegen Name
addBlock bname = do
  bls <- gets blocks
  ix  <- gets blockCount
  nms <- gets names
  let new = emptyBlock ix
      (qname, supply) = uniqueName bname nms
  modify $ \s -> s { blocks = Map.insert (Name qname) new bls
                   , blockCount = ix + 1
                   , names = supply
                   }
  return (Name qname)

setBlock :: Name -> Codegen Name
setBlock bname = do
  modify $ \s -> s { currentBlock = bname }
  return bname

getBlock :: Codegen Name
getBlock = gets currentBlock

modifyBlock :: BlockState -> Codegen ()
modifyBlock new = do
  active <- gets currentBlock
  modify $ \s -> s { blocks = Map.insert active new (blocks s) }

current :: Codegen BlockState
current = do
  c <- gets currentBlock
  blks <- gets blocks
  case Map.lookup c blks of
    Just x  -> return x
    Nothing -> error $ "No such block: " ++ show c

-------------------------------------------------------------------------------
-- Symbol Table
-------------------------------------------------------------------------------

assign :: String -> Operand -> Codegen ()
assign var x = do
  lcls <- gets symtab
  modify $ \s -> s { symtab = (var, x) : lcls }

getvar :: String -> Codegen Operand
getvar var = do
  syms <- gets symtab
  case lookup var syms of
    Just x  -> return x
    Nothing -> error $ "Local variable not in scope: " ++ show var

-------------------------------------------------------------------------------

-- References
local :: Name -> Operand
local = LocalReference

global :: Name -> C.Constant
global = C.GlobalReference

externf :: Name -> Operand
externf = ConstantOperand . C.GlobalReference

-- Arithmetic and Constants
fadd :: Operand -> Operand -> Codegen Operand
fadd a b = instr $ FAdd a b []

fsub :: Operand -> Operand -> Codegen Operand
fsub a b = instr $ FSub a b []

fmul :: Operand -> Operand -> Codegen Operand
fmul a b = instr $ FMul a b []

fdiv :: Operand -> Operand -> Codegen Operand
fdiv a b = instr $ FDiv a b []

fcmp :: FP.FloatingPointPredicate -> Operand -> Operand -> Codegen Operand
fcmp cond a b = instr $ FCmp cond a b []

cons :: C.Constant -> Operand
cons = ConstantOperand

uitofp :: Type -> Operand -> Codegen Operand
uitofp ty a = instr $ UIToFP a ty []

toArgs :: [Operand] -> [(Operand, [A.ParameterAttribute])]
toArgs = map (\x -> (x, []))

-- Effects
call :: Operand -> [Operand] -> Codegen Operand
call fn args = instr $ Call False CC.C [] (Right fn) (toArgs args) [] []

alloca :: Type -> Codegen Operand
alloca ty = instr $ Alloca ty Nothing 0 []

store :: Operand -> Operand -> Codegen Operand
store ptr val = instr $ Store False ptr val Nothing 0 []

load :: Operand -> Codegen Operand
load ptr = instr $ Load False ptr Nothing 0 []

-- Control Flow
br :: Name -> Codegen (Named Terminator)
br val = terminator $ Do $ Br val []

cbr :: Operand -> Name -> Name -> Codegen (Named Terminator)
cbr cond tr fl = terminator $ Do $ CondBr cond tr fl []

phi :: Type -> [(Operand, Name)] -> Codegen Operand
phi ty incoming = instr $ Phi ty incoming []

ret :: Operand -> Codegen (Named Terminator)
ret val = terminator $ Do $ Ret (Just val) []
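The instr and terminator operations above both append to whichever block currentBlock points at, by threading the state through modifyBlock. A minimal, self-contained sketch of that block-stack bookkeeping, using plain strings in place of llvm-general's Instruction and Terminator types (Blk, CG, emit, terminate are names invented for this sketch only):

```haskell
import Control.Monad.State
import qualified Data.Map as Map

-- toy stand-ins for BlockState and CodegenState
data Blk = Blk { ins :: [String], trm :: Maybe String } deriving Show
data CG  = CG  { cur :: String, blks :: Map.Map String Blk }

-- append an "instruction" to the active block, like instr above
emit :: String -> State CG ()
emit i = modify $ \s ->
  s { blks = Map.adjust (\b -> b { ins = ins b ++ [i] }) (cur s) (blks s) }

-- set the active block's terminator, like terminator above
terminate :: String -> State CG ()
terminate t = modify $ \s ->
  s { blks = Map.adjust (\b -> b { trm = Just t }) (cur s) (blks s) }

main :: IO ()
main = do
  let st0 = CG "entry" (Map.fromList [("entry", Blk [] Nothing)])
      st1 = execState (emit "fadd a b" >> terminate "ret") st0
  print (Map.lookup "entry" (blks st1))
```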
chapter27/dsl/JIT.hs (new file)

module JIT where

import Data.Int
import Data.Word
import Foreign.Ptr ( FunPtr, castFunPtr )

import Control.Monad.Error

import LLVM.General.Target
import LLVM.General.Context
import LLVM.General.CodeModel
import LLVM.General.Module as Mod
import qualified LLVM.General.AST as AST

import LLVM.General.PassManager
import LLVM.General.Transforms
import LLVM.General.Analysis

import qualified LLVM.General.ExecutionEngine as EE

foreign import ccall "dynamic" haskFun :: FunPtr (IO Double) -> (IO Double)

run :: FunPtr a -> IO Double
run fn = haskFun (castFunPtr fn :: FunPtr (IO Double))

jit :: Context -> (EE.MCJIT -> IO a) -> IO a
jit c = EE.withMCJIT c optlevel model ptrelim fastins
  where
    optlevel = Just 0  -- optimization level
    model    = Nothing -- code model ( Default )
    ptrelim  = Nothing -- frame pointer elimination
    fastins  = Nothing -- fast instruction selection

passes :: PassSetSpec
passes = defaultCuratedPassSetSpec { optLevel = Just 3 }

runJIT :: AST.Module -> IO (Either String AST.Module)
runJIT mod = do
  withContext $ \context ->
    jit context $ \executionEngine ->
      runErrorT $ withModuleFromAST context mod $ \m ->
        withPassManager passes $ \pm -> do
          -- Optimization Pass
          {-runPassManager pm m-}
          optmod <- moduleAST m
          s <- moduleLLVMAssembly m
          putStrLn s

          EE.withModuleInEngine executionEngine m $ \ee -> do
            mainfn <- EE.getFunction ee (AST.Name "main")
            case mainfn of
              Just fn -> do
                res <- run fn
                putStrLn $ "Evaluated to: " ++ show res
              Nothing -> return ()

          -- Return the optimized module
          return optmod
chapter27/dsl/Main.hs (new file)

import JIT
import Codegen

import qualified LLVM.General.AST as AST
import qualified LLVM.General.AST.Float as F
import qualified LLVM.General.AST.Constant as C

{-

; ModuleID = 'my cool jit'

define double @main() {
entry:
  %1 = fadd double 1.000000e+01, 2.000000e+01
  ret double %1
}

-}

initModule :: AST.Module
initModule = emptyModule "my cool jit"

logic = do
  define double "main" [] $ do
    let a = cons $ C.Float (F.Double 10)
    let b = cons $ C.Float (F.Double 20)
    res <- fadd a b
    ret res

main = do
  let ast = runLLVM initModule logic
  runJIT ast
  return ast
chapter27/dsl/dsl.cabal (new file)

name:                dsl
version:             0.1.0.0
--synopsis:
--description:
license:             MIT
license-file:        LICENSE-MIT
author:              Stephen Diehl
maintainer:          stephen.m.diehl@gmail.com
copyright:           2015 Stephen Diehl
category:            Compilers
build-type:          Simple
cabal-version:       >=1.10

library
  default-language:  Haskell2010
  build-depends:
      base >= 4.6
    , haskeline >= 0.7.1.2
    , llvm-general == 3.4.4.*
    , llvm-general-pure == 3.4.4.*
    , mtl >= 2.2
    , transformers
chapter27/example.cpp (new file)

// Generated by llvm2cpp - DO NOT MODIFY!

#include <llvm/LLVMContext.h>
#include <llvm/Module.h>
#include <llvm/DerivedTypes.h>
#include <llvm/Constants.h>
#include <llvm/GlobalVariable.h>
#include <llvm/Function.h>
#include <llvm/CallingConv.h>
#include <llvm/BasicBlock.h>
#include <llvm/Instructions.h>
#include <llvm/InlineAsm.h>
#include <llvm/Support/FormattedStream.h>
#include <llvm/Support/MathExtras.h>
#include <llvm/Pass.h>
#include <llvm/PassManager.h>
#include <llvm/ADT/SmallVector.h>
#include <llvm/Analysis/Verifier.h>
#include <llvm/Assembly/PrintModulePass.h>
#include <algorithm>
using namespace llvm;

Module* makeLLVMModule();

int main(int argc, char **argv) {
  Module* Mod = makeLLVMModule();
  verifyModule(*Mod, PrintMessageAction);
  PassManager PM;
  PM.add(createPrintModulePass(&outs()));
  PM.run(*Mod);
  return 0;
}

Module* makeLLVMModule() {
  // Module Construction
  Module* mod = new Module("example.ll", getGlobalContext());

  // Type Definitions
  std::vector<Type*> FuncTy_0_args;
  FuncTy_0_args.push_back(IntegerType::get(mod->getContext(), 32));
  FuncTy_0_args.push_back(IntegerType::get(mod->getContext(), 32));
  FuncTy_0_args.push_back(IntegerType::get(mod->getContext(), 32));
  FunctionType* FuncTy_0 = FunctionType::get(
    /*Result=*/IntegerType::get(mod->getContext(), 32),
    /*Params=*/FuncTy_0_args,
    /*isVarArg=*/false);

  // Function Declarations

  Function* func_test1 = mod->getFunction("test1");
  if (!func_test1) {
    func_test1 = Function::Create(
      /*Type=*/FuncTy_0,
      /*Linkage=*/GlobalValue::ExternalLinkage,
      /*Name=*/"test1", mod);
    func_test1->setCallingConv(CallingConv::C);
  }
  AttrListPtr func_test1_PAL;
  func_test1->setAttributes(func_test1_PAL);

  // Global Variable Declarations

  // Constant Definitions

  // Global Variable Definitions

  // Function Definitions

  // Function: test1 (func_test1)
  {
    Function::arg_iterator args = func_test1->arg_begin();
    Value* int32_x = args++;
    int32_x->setName("x");
    Value* int32_y = args++;
    int32_y->setName("y");
    Value* int32_z = args++;
    int32_z->setName("z");

    BasicBlock* label_1 = BasicBlock::Create(mod->getContext(), "", func_test1, 0);

    // Block (label_1)
    BinaryOperator* int32_a = BinaryOperator::Create(Instruction::And, int32_z, int32_x, "a", label_1);
    BinaryOperator* int32_b = BinaryOperator::Create(Instruction::And, int32_z, int32_y, "b", label_1);
    BinaryOperator* int32_c = BinaryOperator::Create(Instruction::Xor, int32_a, int32_b, "c", label_1);
    ReturnInst::Create(mod->getContext(), int32_c, label_1);
  }

  return mod;
}
chapter27/example.ll (new file)

define i32 @test1(i32 %x, i32 %y, i32 %z) {
  %a = and i32 %z, %x
  %b = and i32 %z, %y
  %c = xor i32 %a, %b
  ret i32 %c
}
@@ -22,7 +22,7 @@ main :: IO ()
 main = runInputT defaultSettings loop
   where
   loop = do
-    minput <- getInputLine "Stlc> "
+    minput <- getInputLine "Happy> "
     case minput of
       Nothing -> outputStrLn "Goodbye."
       Just input -> (liftIO $ process input) >> loop
chapter9/assign/.gitignore (new file)

Lexer.hs
Parser.hs
Main
chapter9/assign/Eval.hs (new file)

module Eval (eval) where

import Syntax
import Control.Monad.State
import qualified Data.Map as Map

data Value
  = VInt Int
  | VUnit

instance Show Value where
  show (VInt x) = show x
  show VUnit = "()"

type Eval = StateT Env IO
type Env = [(String, Value)]

eval1 :: Expr -> Eval Value
eval1 expr = case expr of
  Num a -> return (VInt a)
  Var a -> do
    env <- get
    case lookup a env of
      Just val -> return val
      Nothing  -> error "Not in scope"
  Print a -> do
    a' <- eval1 a
    liftIO $ print a'
    return VUnit
  Assign ref val -> do
    modify $ \s -> (ref, VInt val) : s
    return VUnit

eval :: [Expr] -> IO ()
eval xs = evalStateT (mapM_ eval1 xs) []
chapter9/assign/Lexer.x (new file)

{
module Lexer (
  Token(..),
  scanTokens
) where

import Syntax
}

%wrapper "basic"

$digit = 0-9
$alpha = [a-zA-Z]
$eol   = [\n]

tokens :-

  -- Whitespace insensitive
  $eol                          ;
  $white+                       ;
  print                         { \s -> TokenPrint }
  $digit+                       { \s -> TokenNum (read s) }
  \=                            { \s -> TokenEq }
  $alpha [$alpha $digit \_ \']* { \s -> TokenSym s }

{

data Token
  = TokenNum Int
  | TokenSym String
  | TokenPrint
  | TokenEq
  | TokenEOF
  deriving (Eq,Show)

scanTokens = alexScanTokens

}
chapter9/assign/Main.hs (new file)

import Eval (eval)
import Parser (parseExpr)
import System.Environment

process :: String -> IO ()
process input = do
  let ast = parseExpr input
  case ast of
    Right ast -> eval ast
    Left err -> do
      putStrLn "Parser Error:"
      print err

main :: IO ()
main = do
  args <- getArgs
  case args of
    [fname] -> do
      contents <- readFile fname
      process contents
    _ -> putStrLn "Usage: assign <input file>"
chapter9/assign/Makefile (new file)

all:
	alex Lexer.x
	happy Parser.y
	ghc --make Main -o Main
clean:
	rm -f *.o *.hi Parser.hs Lexer.hs Main
chapter9/assign/Parser.y (new file)

{
{-# LANGUAGE GeneralizedNewtypeDeriving #-}

module Parser (
  parseExpr,
) where

import Lexer
import Syntax

import Control.Monad.Except
}

%name expr
%tokentype { Token }
%monad { Except String } { (>>=) } { return }
%error { parseError }

%token
    int   { TokenNum $$ }
    var   { TokenSym $$ }
    print { TokenPrint }
    '='   { TokenEq }

%%

terms
    : term       { [$1] }
    | term terms { $1 : $2 }

term
    : var         { Var $1 }
    | var '=' int { Assign $1 $3 }
    | print term  { Print $2 }

{

parseError :: [Token] -> Except String a
parseError (l:ls) = throwError (show l)
parseError [] = throwError "Unexpected end of Input"

parseExpr :: String -> Either String [Expr]
parseExpr input =
  let tokenStream = scanTokens input in
  runExcept (expr tokenStream)

}
chapter9/assign/Syntax.hs (new file)

module Syntax where

data Expr
  = Var String
  | Num Int
  | Print Expr
  | Assign String Int
  deriving (Eq,Show)
chapter9/assign/happy.cabal (new file)

name:                assign
version:             0.1.0.0
build-type:          Simple
extra-source-files:  README.md
cabal-version:       >=1.10

executable assign
  build-depends:
      base >= 4.6 && <4.7
    , containers >= 0.5 && <0.6
    , mtl >= 2.2
  default-language:  Haskell2010
  main-is:           Main.hs

  build-depends:     base, array
  build-tools:       alex, happy
  other-modules:
    Parser,
    Lexer
chapter9/assign/input.test (new file)

x = 4
print x
y = 5
print y
y = 6
print y
chapter9/happy/.gitignore (new file)

Lexer.hs
Parser.hs
Main
chapter9/happy/Eval.hs (new file)

module Eval where

import Syntax

import Control.Monad.Identity
import qualified Data.Map as Map

data Value
  = VInt Integer
  | VBool Bool
  | VClosure String Expr (Eval.Scope)

instance Show Value where
  show (VInt x) = show x
  show (VBool x) = show x
  show VClosure{} = "<<closure>>"

type Evaluate t = Identity t
type Scope = Map.Map String Value

eval :: Eval.Scope -> Expr -> Identity Value
eval env expr = case expr of
  Lit (LInt x)  -> return $ VInt (fromIntegral x)
  Lit (LBool x) -> return $ VBool x
  Var x -> return $ env Map.! x
  Lam x body -> return (VClosure x body env)
  App a b -> do
    x <- eval env a
    y <- eval env b
    apply x y
  Op op a b -> do
    x <- eval env a
    y <- eval env b
    return $ binop op x y

binop :: Binop -> Value -> Value -> Value
binop Add (VInt a) (VInt b) = VInt (a+b)
binop Sub (VInt a) (VInt b) = VInt (a-b)
binop Mul (VInt a) (VInt b) = VInt (a*b)
binop Eql (VInt a) (VInt b) = VBool (a==b)

extend :: Scope -> String -> Value -> Scope
extend env v t = Map.insert v t env

apply :: Value -> Value -> Evaluate Value
apply (VClosure v t0 e) t1 = eval (extend e v t1) t0
apply _ _ = error "Tried to apply a non-closure"

emptyScope :: Scope
emptyScope = Map.empty

runEval :: Expr -> Value
runEval x = runIdentity (eval emptyScope x)
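The eval/apply pair above is a textbook closure-capturing interpreter: a lambda closes over its defining Scope, and application extends that captured scope rather than the caller's. A standalone sketch of the same strategy on an inlined miniature AST (all names here are invented for the sketch, not part of the chapter's modules):

```haskell
import qualified Data.Map as Map

data Exp = V String | L String Exp | A Exp Exp | I Int
data Val = VI Int | VC String Exp (Map.Map String Val)

evalE :: Map.Map String Val -> Exp -> Val
evalE env e = case e of
  I n   -> VI n
  V x   -> env Map.! x
  L x b -> VC x b env                  -- capture the defining environment
  A f a -> case evalE env f of
    VC x b clo -> evalE (Map.insert x (evalE env a) clo) b
    _          -> error "tried to apply a non-closure"

main :: IO ()
main = case evalE Map.empty (A (L "x" (V "x")) (I 42)) of
  VI n -> print n
  _    -> error "unexpected closure result"
```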
chapter9/happy/Lexer.x (new file)

{
module Lexer (
  Token(..),
  scanTokens
) where

import Syntax
}

%wrapper "basic"

$digit = 0-9
$alpha = [a-zA-Z]
$eol   = [\n]

tokens :-

  -- Whitespace insensitive
  $eol                          ;
  $white+                       ;

  -- Comments
  "#".*                         ;

  -- Syntax
  let                           { \s -> TokenLet }
  True                          { \s -> TokenTrue }
  False                         { \s -> TokenFalse }
  in                            { \s -> TokenIn }
  $digit+                       { \s -> TokenNum (read s) }
  "->"                          { \s -> TokenArrow }
  \=                            { \s -> TokenEq }
  \\                            { \s -> TokenLambda }
  [\+]                          { \s -> TokenAdd }
  [\-]                          { \s -> TokenSub }
  [\*]                          { \s -> TokenMul }
  \(                            { \s -> TokenLParen }
  \)                            { \s -> TokenRParen }
  $alpha [$alpha $digit \_ \']* { \s -> TokenSym s }

{

data Token
  = TokenLet
  | TokenTrue
  | TokenFalse
  | TokenIn
  | TokenLambda
  | TokenNum Int
  | TokenSym String
  | TokenArrow
  | TokenEq
  | TokenAdd
  | TokenSub
  | TokenMul
  | TokenLParen
  | TokenRParen
  | TokenEOF
  deriving (Eq,Show)

scanTokens = alexScanTokens

}
chapter9/happy/Main.hs (new file)

import Eval
import Parser (parseExpr, parseTokens)

import Control.Monad.Trans
import System.Console.Haskeline

process :: String -> IO ()
process input = do
  let tokens = parseTokens input
  print tokens
  let ast = parseExpr input
  case ast of
    Left err -> do
      putStrLn "Parser Error:"
      print err
    Right ast -> print $ runEval ast

main :: IO ()
main = runInputT defaultSettings loop
  where
  loop = do
    minput <- getInputLine "Happy> "
    case minput of
      Nothing -> outputStrLn "Goodbye."
      Just input -> (liftIO $ process input) >> loop
chapter9/happy/Makefile (new file)

all:
	alex Lexer.x
	happy Parser.y
	ghc --make Main -o Main
clean:
	rm -f *.o *.hi Parser.hs Lexer.hs Main
chapter9/happy/Parser.y (new file)

{
{-# LANGUAGE GeneralizedNewtypeDeriving #-}

module Parser (
  parseExpr,
  parseTokens,
) where

import Lexer
import Syntax

import Control.Monad.Except

}

-- Entry point
%name expr

-- Lexer structure
%tokentype { Token }

-- Parser monad
%monad { Except String } { (>>=) } { return }
%error { parseError }

-- Token Names
%token
    let   { TokenLet }
    true  { TokenTrue }
    false { TokenFalse }
    in    { TokenIn }
    NUM   { TokenNum $$ }
    VAR   { TokenSym $$ }
    '\\'  { TokenLambda }
    '->'  { TokenArrow }
    '='   { TokenEq }
    '+'   { TokenAdd }
    '-'   { TokenSub }
    '*'   { TokenMul }
    '('   { TokenLParen }
    ')'   { TokenRParen }

-- Operators
%left '+' '-'
%left '*'
%%

Expr : let VAR '=' Expr in Expr { App (Lam $2 $6) $4 }
     | '\\' VAR '->' Expr       { Lam $2 $4 }
     | Form                     { $1 }

Form : Form '+' Form            { Op Add $1 $3 }
     | Form '-' Form            { Op Sub $1 $3 }
     | Form '*' Form            { Op Mul $1 $3 }
     | Fact                     { $1 }

Fact : Fact Atom                { App $1 $2 }
     | Atom                     { $1 }

Atom : '(' Expr ')'             { $2 }
     | NUM                      { Lit (LInt $1) }
     | VAR                      { Var $1 }
     | true                     { Lit (LBool True) }
     | false                    { Lit (LBool False) }

{

parseError :: [Token] -> Except String a
parseError (l:ls) = throwError (show l)
parseError [] = throwError "Unexpected end of Input"

parseExpr :: String -> Either String Expr
parseExpr input =
  let tokenStream = scanTokens input in
  runExcept (expr tokenStream)

parseTokens :: String -> [Token]
parseTokens = scanTokens

}
|
19
chapter9/happy/Syntax.hs
Normal file
@ -0,0 +1,19 @@
|
||||
module Syntax where
|
||||
|
||||
type Name = String
|
||||
|
||||
data Expr
|
||||
= Lam Name Expr
|
||||
| App Expr Expr
|
||||
| Var Name
|
||||
| Lit Lit
|
||||
| Op Binop Expr Expr
|
||||
deriving (Eq,Show)
|
||||
|
||||
data Lit
|
||||
= LInt Int
|
||||
| LBool Bool
|
||||
deriving (Show, Eq, Ord)
|
||||
|
||||
data Binop = Add | Sub | Mul | Eql
|
||||
deriving (Eq, Ord, Show)
|
chapter9/happy/happy.cabal (new file)

name:                happyParser
version:             0.1.0.0
license:             MIT
license-file:        LICENSE
author:              Stephen Diehl
maintainer:          stephen.m.diehl@gmail.com
build-type:          Simple
extra-source-files:  README.md
cabal-version:       >=1.10

executable happyParser
  build-depends:
      base >= 4.6 && <4.7
    , pretty >= 1.1 && <1.2
    , parsec >= 3.1 && <3.2
    , containers >= 0.5 && <0.6
    , haskeline >= 0.7
    , mtl
    , transformers
  default-language:  Haskell2010
  main-is:           Main.hs

  build-depends:     base, array
  build-tools:       alex, happy
  other-modules:
    Parser,
    Lexer
chapter9/layout/Layout.hs (new file)

{-# LANGUAGE FlexibleContexts #-}

module Layout (
  -- * Layout combinators
  IParsec,
  laidout,
  indented,
  align,
  runIndentParser,
) where

import Data.Text.Lazy

import Text.Parsec (ParseError)
import Text.Parsec.Pos
import Text.Parsec.Prim hiding (State)

import Control.Monad.Identity
import Control.Applicative ((<$>))

-- Indentation sensitive Parsec monad.
type IParsec a = Parsec Text ParseState a

data ParseState = ParseState
  { indents :: Column
  } deriving (Show)

initParseState :: ParseState
initParseState = ParseState 0

indentCmp
  :: (Column -> Column -> Bool)
  -> IParsec ()
indentCmp cmp = do
  col <- sourceColumn <$> getPosition
  current <- indents <$> getState
  guard (col `cmp` current)

withIndent :: Monad m => Column -> Column -> ParsecT s ParseState m b -> ParsecT s ParseState m b
withIndent cur pos m = do
  modifyState $ \st -> st { indents = pos }
  res <- m
  modifyState $ \st -> st { indents = cur }
  return res

laidout :: Parsec s ParseState a -> Parsec s ParseState a
laidout m = do
  cur <- indents <$> getState
  pos <- sourceColumn <$> getPosition
  res <- withIndent cur pos m
  return res

indented :: IParsec ()
indented = indentCmp (>) <?> "Block (indented)"

align :: IParsec ()
align = indentCmp (==) <?> "Block (same indentation)"

runIndentParser
  :: Stream Text Identity a
  => SourceName
  -> IParsec a
  -> Text
  -> Either ParseError a
runIndentParser filePath p = runParser p initParseState filePath
chapter9/provenance/Eval.hs (new file)

module Eval where

import Syntax

import Control.Monad.Identity
import qualified Data.Map as Map

data Value
  = VInt Integer
  | VBool Bool
  | VClosure String Expr TermEnv

type TermEnv = Map.Map String Value
type Interpreter t = Identity t

emptyTmenv :: TermEnv
emptyTmenv = Map.empty

instance Show Value where
  show (VInt n) = show n
  show (VBool n) = show n
  show VClosure{} = "<<closure>>"

eval :: TermEnv -> Expr -> Interpreter Value
eval env expr = case expr of
  Var _ x -> do
    let Just v = Map.lookup x env
    return v

  Lam _ x body ->
    return (VClosure x body env)

  App _ fun arg -> do
    VClosure x body clo <- eval env fun
    argv <- eval env arg
    let nenv = Map.insert x argv clo
    eval nenv body

runEval :: TermEnv -> String -> Expr -> (Value, TermEnv)
runEval env nm ex =
  let res = runIdentity (eval env ex) in
  (res, Map.insert nm res env)
205
chapter9/provenance/Infer.hs
Normal file
@ -0,0 +1,205 @@
|
||||
{-# LANGUAGE FlexibleInstances #-}
|
||||
{-# LANGUAGE FlexibleContexts #-}
|
||||
{-# LANGUAGE TypeSynonymInstances #-}
|
||||
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
|
||||
|
||||
module Infer (
|
||||
inferTop,
|
||||
TypeError(..),
|
||||
Env,
|
||||
) where
|
||||
|
||||
import Control.Monad.Except
|
||||
import Control.Monad.Reader
|
||||
import Control.Monad.Writer
|
||||
import Control.Monad.State
|
||||
|
||||
import qualified Data.Map as Map
|
||||
import qualified Data.Set as Set
|
||||
|
||||
import Syntax
|
||||
import Type
|
||||
|
||||
setLoc :: Loc -> Type -> Type
|
||||
setLoc l (TVar _ a) = TVar l a
|
||||
setLoc l (TCon _ a) = TCon l a
|
||||
setLoc l (TArr _ a b) = TArr l a b
|
||||
|
||||
getLoc :: Type -> Loc
|
||||
getLoc (TVar l _) = l
|
||||
getLoc (TCon l _) = l
|
||||
getLoc (TArr l _ _) = l
|
||||
|
||||
-------------------------------------------------------------------------------
|
||||
-- Substitution
|
||||
-------------------------------------------------------------------------------
|
||||
|
||||
type Unifier = (Subst, [Constraint])
|
||||
type Constraint = (Type, Type)
|
||||
type Env = [(Name, Type)]
|
||||
|
||||
extend :: (Name, Type) -> Env -> Env
|
||||
extend xt env = xt : env
|
||||
|
||||
newtype Subst = Subst (Map.Map TVar Type)
|
||||
deriving (Eq, Ord, Show, Monoid)
|
||||
|
||||
class Substitutable a where
|
||||
apply :: Subst -> a -> a
|
||||
ftv :: a -> Set.Set TVar
|
||||
|
||||
instance Substitutable Type where
|
||||
apply _ (TCon l a) = TCon l a
|
||||
apply (Subst s) t@(TVar l a) = Map.findWithDefault t a s
|
||||
apply s (TArr l t1 t2) = TArr l (apply s t1) (apply s t2)
|
||||
|
||||
ftv TCon{} = Set.empty
|
||||
ftv (TVar _ a) = Set.singleton a
|
||||
ftv (TArr _ t1 t2) = ftv t1 `Set.union` ftv t2
|
||||
|
||||
instance Substitutable Constraint where
|
||||
apply s (t1, t2) = (apply s t1, apply s t2)
|
||||
ftv (t1, t2) = ftv t1 `Set.union` ftv t2
|
||||
|
||||
instance Substitutable a => Substitutable [a] where
|
||||
apply = map . apply
|
||||
ftv = foldr (Set.union . ftv) Set.empty
|
||||
|
||||
data TypeError
|
||||
= UnificationFail Type Loc Type Loc
|
||||
| InfiniteType TVar Type
|
||||
| UnboundVariable String
|
||||
| Ambigious [Constraint]
|
||||
| UnificationMismatch [Type] [Type]
|
||||
|
||||
-------------------------------------------------------------------------------
|
||||
-- Environment
|
||||
-------------------------------------------------------------------------------
|
||||
|
||||
inEnv :: (Name, Type) -> Check a -> Check a
|
||||
inEnv (x,t) = local (extend (x,t))
|
||||
|
||||
lookupVar :: Name -> Check Type
|
||||
lookupVar x = do
|
||||
env <- ask
|
||||
case lookup x env of
|
||||
Nothing -> throwError $ UnboundVariable x
|
||||
Just s -> return s
|
||||
|
||||
letters :: [String]
|
||||
letters = [1..] >>= flip replicateM ['a'..'z']
|
||||
|
||||
fresh :: Loc -> Check Type
|
||||
fresh l = do
|
||||
s <- get
|
||||
put s{count = count s + 1}
|
||||
return $ TVar l (TV (letters !! count s))
|
||||
|
||||
-------------------------------------------------------------------------------
|
||||
-- Type Checker
|
||||
-------------------------------------------------------------------------------
|
||||
|
||||
-- | Inference state
|
||||
data InferState = InferState { count :: Int }
|
||||
type Check =
|
||||
  WriterT [Constraint]
    (StateT InferState
      (ExceptT TypeError (Reader Env)))

check :: Expr -> Check Type
check expr = case expr of

  Var l n -> do
    t <- lookupVar n
    return $ setLoc l t

  App l a b -> do
    ta <- check a
    tb <- check b
    tr <- fresh l
    unify ta (TArr l tb tr)
    return tr

  Lam l n a -> do
    tv <- fresh l
    ty <- inEnv (n, tv) (check a)
    return (TArr l (setLoc l tv) ty)

  Lit l _ -> return $ TCon l "Int"

runCheck :: Env -> Check a -> Either TypeError (a, [Constraint])
runCheck env =
    flip runReader env
  . runExceptT
  . flip evalStateT (InferState 0)
  . runWriterT

inferTop :: Env -> Expr -> Either TypeError Type
inferTop env x = do
  (ty, cs) <- runCheck env (check x)
  s <- runSolve cs
  return (apply s ty)

-------------------------------------------------------------------------------
-- Constraint Solving
-------------------------------------------------------------------------------

type Solve = StateT Unifier (Except TypeError)

runSolve :: [Constraint] -> Either TypeError Subst
runSolve cs = runExcept (evalStateT solver st)
  where st = (emptySubst, cs)

-- Unification solver
solver :: Solve Subst
solver = do
  (su, cs) <- get
  case cs of
    [] -> return su
    ((t1, t2) : cs0) -> do
      (su1, cs1) <- unifies t1 t2
      put (su1 `compose` su, cs1 ++ apply su1 cs0)
      solver

-- | Empty unifier
emptyUnifier :: Unifier
emptyUnifier = (emptySubst, [])

-- | The empty substitution
emptySubst :: Subst
emptySubst = mempty

-- | Compose substitutions
compose :: Subst -> Subst -> Subst
(Subst s1) `compose` (Subst s2) = Subst $ Map.map (apply (Subst s1)) s2 `Map.union` s1

bind :: TVar -> Type -> Solve Unifier
bind a t
  | eqLoc t a       = return (emptySubst, [])
  | occursCheck a t = throwError $ InfiniteType a t
  | otherwise       = return (Subst $ Map.singleton a t, [])

eqLoc :: Type -> TVar -> Bool
eqLoc (TVar _ a) b = a == b
eqLoc _ _ = False

occursCheck :: Substitutable a => TVar -> a -> Bool
occursCheck a t = a `Set.member` ftv t

unifies :: Type -> Type -> Solve Unifier
unifies t1 t2 | t1 == t2 = return emptyUnifier
unifies (TVar _ v) t = v `bind` t
unifies t (TVar _ v) = v `bind` t
unifies (TArr _ t1 t2) (TArr _ t3 t4) = unifyMany [t1, t2] [t3, t4]
unifies t1 t2 = throwError $ UnificationFail t1 (getLoc t1) t2 (getLoc t2)

unifyMany :: [Type] -> [Type] -> Solve Unifier
unifyMany [] [] = return emptyUnifier
unifyMany (t1 : ts1) (t2 : ts2) = do
  (su1, cs1) <- unifies t1 t2
  (su2, cs2) <- unifyMany (apply su1 ts1) (apply su1 ts2)
  return (su2 `compose` su1, cs1 ++ cs2)
unifyMany t1 t2 = throwError $ UnificationMismatch t1 t2

-- | Emit a unification constraint; actual solving is deferred to the solver
unify :: Type -> Type -> Check ()
unify t1 t2 = tell [(t1, t2)]
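Note that `unify` above does no solving at the point of use: it merely logs an equality constraint through the `WriterT` layer, and the solver discharges the accumulated list later. A minimal standalone sketch of that pattern (the names `MiniCheck` and `demo` are illustrative, not part of the chapter's code):

```haskell
import Control.Monad.Writer

-- A "constraint" here is reduced to a pair of type names.
type Constraint = (String, String)

-- The WriterT layer of the Check stack, in isolation.
type MiniCheck a = Writer [Constraint] a

-- Like the chapter's unify: emit a constraint, solve nothing yet.
unify' :: String -> String -> MiniCheck ()
unify' a b = tell [(a, b)]

-- Two constraint emissions; the log comes back from runWriter.
demo :: ((), [Constraint])
demo = runWriter (unify' "a" "Int" >> unify' "b" "a -> Int")

main :: IO ()
main = print (snd demo)
-- prints [("a","Int"),("b","a -> Int")]
```

Decoupling constraint generation from solving is what lets the solver report errors with the provenance attached to each recorded type pair.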
chapter9/provenance/Lexer.hs (new file, 73 lines)
@@ -0,0 +1,73 @@
module Lexer where

import Text.Parsec
import Text.Parsec.Text.Lazy
import qualified Data.Text.Lazy as L
import qualified Text.Parsec.Token as Tok
import qualified Text.Parsec.Expr as Ex

import Data.Functor.Identity

type Op a = Ex.Operator L.Text () Identity a
type Operators a = Ex.OperatorTable L.Text () Identity a

reservedNames :: [String]
reservedNames = [
    "let",
    "in",
    "fix",
    "rec",
    "if",
    "then",
    "else"
  ]

reservedOps :: [String]
reservedOps = [
    "->",
    "\\",
    "+",
    "*",
    "-",
    "="
  ]

lexer :: Tok.GenTokenParser L.Text () Identity
lexer = Tok.makeTokenParser $ Tok.LanguageDef
  { Tok.commentStart    = "{-"
  , Tok.commentEnd      = "-}"
  , Tok.commentLine     = "--"
  , Tok.nestedComments  = True
  , Tok.identStart      = letter
  , Tok.identLetter     = alphaNum <|> oneOf "_'"
  , Tok.opStart         = oneOf ":!#$%&*+./<=>?@\\^|-~"
  , Tok.opLetter        = oneOf ":!#$%&*+./<=>?@\\^|-~"
  , Tok.reservedNames   = reservedNames
  , Tok.reservedOpNames = reservedOps
  , Tok.caseSensitive   = True
  }

reserved :: String -> Parser ()
reserved = Tok.reserved lexer

reservedOp :: String -> Parser ()
reservedOp = Tok.reservedOp lexer

identifier :: Parser String
identifier = Tok.identifier lexer

parens :: Parser a -> Parser a
parens = Tok.parens lexer

semiSep :: Parser a -> Parser [a]
semiSep = Tok.semiSep lexer

semi :: Parser String
semi = Tok.semi lexer

contents :: Parser a -> Parser a
contents p = do
  Tok.whiteSpace lexer
  r <- p
  eof
  return r
chapter9/provenance/Main.hs (new file, 149 lines)
@@ -0,0 +1,149 @@
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE TypeSynonymInstances #-}

import Syntax
import Infer
import Parser
import Pretty

import Data.Monoid
import qualified Data.Map as Map
import qualified Data.Text.Lazy as L
import qualified Data.Text.Lazy.IO as L

import Control.Monad.Identity
import Control.Monad.State.Strict

import Data.List (isPrefixOf)

import System.Exit
import System.Environment
import System.Console.Repline

-------------------------------------------------------------------------------
-- Types
-------------------------------------------------------------------------------

data IState = IState
  { tyctx :: Env  -- Type environment
  }

initState :: IState
initState = IState []

type Repl a = HaskelineT (StateT IState IO) a

hoistErr :: Show e => Either e a -> Repl a
hoistErr (Right val) = return val
hoistErr (Left err) = do
  liftIO $ print err
  abort

-------------------------------------------------------------------------------
-- Execution
-------------------------------------------------------------------------------

exec :: Bool -> L.Text -> Repl ()
exec update source = do
  -- Get the current interpreter state
  st <- get

  -- Parser ( returns AST )
  mod <- hoistErr $ parseExpr source

  -- Type Inference ( returns the inferred Type )
  ty <- hoistErr $ inferTop (tyctx st) mod
  liftIO $ putStrLn (pptype ty)

showOutput :: String -> IState -> Repl ()
showOutput arg st =
  case lookup "it" (tyctx st) of
    Just val -> liftIO $ putStrLn $ ppsignature (arg, val)
    Nothing  -> return ()

cmd :: String -> Repl ()
cmd source = exec True (L.pack source)

-------------------------------------------------------------------------------
-- Commands
-------------------------------------------------------------------------------

-- :browse command
browse :: [String] -> Repl ()
browse _ = do
  st <- get
  undefined
  {-liftIO $ mapM_ putStrLn $ ppenv (tyctx st)-}

-- :load command
load :: [String] -> Repl ()
load args = do
  contents <- liftIO $ L.readFile (unwords args)
  exec True contents

-- :type command
typeof :: [String] -> Repl ()
typeof args = do
  st <- get
  let arg = unwords args
  case lookup arg (tyctx st) of
    Just val -> liftIO $ putStrLn $ ppsignature (arg, val)
    Nothing  -> exec False (L.pack arg)

-- :quit command
quit :: a -> Repl ()
quit _ = liftIO exitSuccess

-------------------------------------------------------------------------------
-- Interactive Shell
-------------------------------------------------------------------------------

-- Prefix tab completer
defaultMatcher :: MonadIO m => [(String, CompletionFunc m)]
defaultMatcher = [
    (":load", fileCompleter)
  --, (":type", values)
  ]

-- Default tab completer
comp :: (Monad m, MonadState IState m) => WordCompleter m
comp n = do
  let cmds = [":load", ":type", ":browse", ":quit"]
  ctx <- gets tyctx
  let defs = fmap fst ctx
  return $ filter (isPrefixOf n) (cmds ++ defs)

options :: [(String, [String] -> Repl ())]
options = [
    ("load"   , load)
  , ("browse" , browse)
  , ("quit"   , quit)
  , ("type"   , Main.typeof)
  ]

-------------------------------------------------------------------------------
-- Entry Point
-------------------------------------------------------------------------------

completer :: CompleterStyle (StateT IState IO)
completer = Prefix (wordCompleter comp) defaultMatcher

shell :: Repl a -> IO ()
shell pre = flip evalStateT initState
  $ evalRepl "Poly> " cmd options completer pre

-------------------------------------------------------------------------------
-- Toplevel
-------------------------------------------------------------------------------

main :: IO ()
main = do
  args <- getArgs
  case args of
    []              -> shell (return ())
    [fname]         -> shell (load [fname])
    ["test", fname] -> shell (load [fname] >> browse [] >> quit ())
    _               -> putStrLn "invalid arguments"
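The REPL keeps its typing environment in the `StateT IState IO` layer underneath `HaskelineT`, so each command reads and extends the same environment. The state-threading part in isolation, as a minimal sketch (`step` and the list-based environment are illustrative, not project code):

```haskell
import Control.Monad.State.Strict

-- The REPL's environment, reduced to an association list of signatures.
type Env = [(String, String)]

-- One "command": record a binding, like exec extending tyctx.
step :: Monad m => String -> String -> StateT Env m ()
step name ty = modify ((name, ty) :)

-- Run two commands against the empty environment and keep the final state.
finalEnv :: Env
finalEnv = execState (step "id" "a -> a" >> step "it" "Int") []

main :: IO ()
main = print (lookup "it" finalEnv)
-- prints Just "Int"
```

The `it` convention here mirrors `showOutput` above, which looks up the binding named `"it"` to report the type of the last evaluated expression.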
chapter9/provenance/Parser.hs (new file, 77 lines)
@@ -0,0 +1,77 @@
{-# LANGUAGE OverloadedStrings #-}

module Parser (
  parseExpr,
  parseModule
) where

import Text.Parsec
import Text.Parsec.Text.Lazy (Parser)

import qualified Text.Parsec.Expr as Ex
import qualified Text.Parsec.Token as Tok

import qualified Data.Text.Lazy as L

import Lexer
import Syntax

import Control.Applicative ((<$>))

integer :: Parser Integer
integer = Tok.integer lexer

variable :: Parser Expr
variable = do
  x <- identifier
  l <- sourceLine <$> getPosition
  return (Var (Located l) x)

number :: Parser Expr
number = do
  n <- integer
  l <- sourceLine <$> getPosition
  return (Lit (Located l) (fromIntegral n))

lambda :: Parser Expr
lambda = do
  reservedOp "\\"
  args <- many identifier
  reservedOp "->"
  body <- expr
  l <- sourceLine <$> getPosition
  return $ foldr (Lam (Located l)) body args

aexp :: Parser Expr
aexp = parens expr
   <|> lambda
   <|> number
   <|> variable

expr :: Parser Expr
expr = do
  es <- many1 aexp
  l <- sourceLine <$> getPosition
  return (foldl1 (App (Located l)) es)

type Binding = (String, Expr)

val :: Parser Binding
val = do
  ex <- expr
  return ("it", ex)

top :: Parser Binding
top = do
  x <- val
  optional semi
  return x

modl :: Parser [Binding]
modl = many top

parseExpr :: L.Text -> Either ParseError Expr
parseExpr input = parse (contents expr) "<stdin>" input

parseModule :: FilePath -> L.Text -> Either ParseError [(String, Expr)]
parseModule fname input = parse (contents modl) fname input
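The `lambda` parser accepts several argument names and folds them into nested single-argument lambdas with `foldr`, so every binder in the core language takes exactly one argument. A self-contained sketch of that desugaring, with the location argument elided and a throwaway `Expr` type (nothing below is project code):

```haskell
-- Toy expression type, just for the desugaring demo.
data Expr = Var String | Lam String Expr
  deriving (Show, Eq)

-- \x y -> e  becomes  Lam "x" (Lam "y" e), exactly the shape produced by
-- foldr (Lam (Located l)) body args in the parser above.
curried :: [String] -> Expr -> Expr
curried args body = foldr Lam body args

main :: IO ()
main = print (curried ["x", "y"] (Var "x"))
-- prints Lam "x" (Lam "y" (Var "x"))
```

The dual trick appears in `expr`, where `foldl1 (App …)` turns a run of atomic expressions into left-nested application, matching the usual left-associativity of function application.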
chapter9/provenance/Pretty.hs (new file, 82 lines)
@@ -0,0 +1,82 @@
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE TypeSynonymInstances #-}

module Pretty (
  ppdecl,
  ppexpr,
  ppsignature,
  pptype
) where

import Type
import Syntax
import Infer

import Text.PrettyPrint

parensIf :: Bool -> Doc -> Doc
parensIf True  = parens
parensIf False = id

class Pretty p where
  ppr :: Int -> p -> Doc

  pp :: p -> Doc
  pp = ppr 0

instance Pretty Name where
  ppr _ x = text x

instance Pretty TVar where
  ppr _ (TV x) = text x

instance Pretty Type where
  ppr p (TArr _ a b) = parensIf (isArrow a) (ppr p a) <+> text "->" <+> ppr p b
    where
      isArrow TArr{} = True
      isArrow _      = False
  ppr p (TVar _ a) = ppr p a
  ppr _ (TCon _ a) = text a

instance Pretty Expr where
  ppr p (Var _ a)   = ppr p a
  ppr p (App _ a b) = parensIf (p > 0) $ ppr (p + 1) a <+> ppr p b
  ppr p (Lam _ a b) = text "\\" <> ppr p a <+> text "->" <+> ppr p b
  ppr _ (Lit _ a)   = int a

instance Pretty Loc where
  ppr _ NoLoc       = ""
  ppr _ (Located n) = int n

instance Show TypeError where
  show (UnificationFail a la b lb) =
    concat [
      "Cannot unify types: \n\t"
    , pptype a
    , "\n\tIntroduced at: "
    , pploc la
    , "\nwith \n\t"
    , pptype b
    , "\n\tIntroduced at: "
    , pploc lb
    ]
  show (InfiniteType (TV a) b) =
    concat ["Cannot construct the infinite type: ", a, " = ", pptype b]
  show (Ambigious cs) =
    concat ["Cannot match expected type: '" ++ pptype a ++ "' with actual type: '" ++ pptype b ++ "'\n" | (a, b) <- cs]
  show (UnboundVariable a) = "Not in scope: " ++ a

pploc :: Loc -> String
pploc = render . ppr 0

pptype :: Type -> String
pptype = render . ppr 0

ppexpr :: Expr -> String
ppexpr = render . ppr 0

ppsignature :: (String, Type) -> String
ppsignature (a, b) = a ++ " : " ++ pptype b

ppdecl :: (String, Expr) -> String
ppdecl (a, b) = "let " ++ a ++ " = " ++ ppexpr b
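The `parensIf` combinator drives all precedence decisions in this printer: arrows associate to the right, so only an arrow in argument position needs parentheses. A standalone sketch of the same trick over plain `String`s instead of `Text.PrettyPrint`'s `Doc` (the `Ty` and `ppty` names are illustrative):

```haskell
-- Wrap a rendering in parentheses only when the flag demands it.
parensIf :: Bool -> String -> String
parensIf True s  = "(" ++ s ++ ")"
parensIf False s = s

-- A cut-down type language: constructors and arrows.
data Ty = TCon String | TArr Ty Ty

-- Arrows are right-associative, so a -> b -> c needs no parentheses,
-- but (a -> b) -> c must parenthesize its argument.
ppty :: Ty -> String
ppty (TCon a)   = a
ppty (TArr a b) = parensIf (isArrow a) (ppty a) ++ " -> " ++ ppty b
  where
    isArrow TArr{} = True
    isArrow _      = False

main :: IO ()
main = putStrLn (ppty (TArr (TArr (TCon "a") (TCon "b")) (TCon "c")))
-- prints (a -> b) -> c
```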
chapter9/provenance/Syntax.hs (new file, 15 lines)
@@ -0,0 +1,15 @@
module Syntax (
  Expr(..),
  Name,
  Loc(..),
) where

type Name = String

data Loc = NoLoc | Located Int
  deriving (Show, Eq, Ord)

data Expr
  = Var Loc Name
  | App Loc Expr Expr
  | Lam Loc Name Expr
  | Lit Loc Int
chapter9/provenance/Type.hs (new file, 15 lines)
@@ -0,0 +1,15 @@
module Type (
  Type(..),
  TVar(..),
) where

import Syntax (Loc, Name)

data Type
  = TVar Loc TVar
  | TCon Loc Name
  | TArr Loc Type Type
  deriving (Show, Eq, Ord)

newtype TVar = TV String
  deriving (Show, Eq, Ord)
@@ -1,3 +1,6 @@
Contributing
============

As always, I rely on the perpetual kindness and goodwill of Haskellers (like
you!) to help correct grammar, clarify, and fix errors.

@@ -11,6 +14,10 @@ repo and then submit a pull request on Github. There should be no need to
compile the text locally. I will try to merge the changes quickly and rebuild
the text daily.

If you would like to add your name to
[CONTRIBUTORS.md](https://github.com/sdiehl/write-you-a-haskell/blob/master/CONTRIBUTORS.md),
include that change along with your pull request.

*Complex Fixes*

If you'd like to submit a change to the publishing software around the text,

@@ -59,6 +66,28 @@ syntax highlighting.

*Math Typesetting*

Equations can be included in display form:

```latex
$$
\int_\Omega \mathrm{d}\omega = \oint_{\partial \Omega} \omega
$$
```

$$
\int_\Omega \mathrm{d}\omega = \oint_{\partial \Omega} \omega
$$

Or in inline form (like $a^2 + b^2 = c^2$) with single dollar signs. Note that
there must be no spaces around the dollar signs, otherwise Pandoc will not
parse it properly.

```latex
$a^2 + b^2 = c^2$
```

For most definitions, the ``aligned`` block is used:

```latex
$$
\begin{aligned}

@@ -105,7 +134,7 @@ Typography
LaTeX
-----

-The $\latex$ styling is sourced from the ``template.latex`` file, which is an
+The LaTeX styling is sourced from the ``template.latex`` file, which is an
extension of Pandoc's default template with some custom modifications.

Images
@@ -23,6 +23,18 @@ pre code {
    font: 15px/19px Inconsolata, Monaco, "Lucida Console", Terminal, "Courier New", Courier;
}

.figure {
    text-align: center;
}

.pagetitle .figure {
    text-align: left !important;
}

.pagetitle .figure img {
    height: 36px;
}

table th {
    border-right: 1em solid transparent;
}
img/Haskell-Logo.ps (new file, 96 lines)
@@ -0,0 +1,96 @@
%!PS-Adobe-3.0
%%Creator: cairo 1.10.2 (http://cairographics.org)
%%CreationDate: Wed Jan 14 11:07:43 2015
%%Pages: 1
%%BoundingBox: 0 0 490 349
%%DocumentData: Clean7Bit
%%LanguageLevel: 2
%%DocumentMedia: 173x123mm 490 348 0 () ()
%%EndComments
%%BeginProlog
/languagelevel where
{ pop languagelevel } { 1 } ifelse
2 lt { /Helvetica findfont 12 scalefont setfont 50 500 moveto
(This print job requires a PostScript Language Level 2 printer.) show
showpage quit } if
/q { gsave } bind def
/Q { grestore } bind def
/cm { 6 array astore concat } bind def
/w { setlinewidth } bind def
/J { setlinecap } bind def
/j { setlinejoin } bind def
/M { setmiterlimit } bind def
/d { setdash } bind def
/m { moveto } bind def
/l { lineto } bind def
/c { curveto } bind def
/h { closepath } bind def
/re { exch dup neg 3 1 roll 5 3 roll moveto 0 rlineto
0 exch rlineto 0 rlineto closepath } bind def
/S { stroke } bind def
/f { fill } bind def
/f* { eofill } bind def
/n { newpath } bind def
/W { clip } bind def
/W* { eoclip } bind def
/BT { } bind def
/ET { } bind def
/pdfmark where { pop globaldict /?pdfmark /exec load put }
{ globaldict begin /?pdfmark /pop load def /pdfmark
/cleartomark load def end } ifelse
/BDC { mark 3 1 roll /BDC pdfmark } bind def
/EMC { mark /EMC pdfmark } bind def
/cairo_store_point { /cairo_point_y exch def /cairo_point_x exch def } def
/Tj { show currentpoint cairo_store_point } bind def
/TJ {
{
dup
type /stringtype eq
{ show } { -0.001 mul 0 cairo_font_matrix dtransform rmoveto } ifelse
} forall
currentpoint cairo_store_point
} bind def
/cairo_selectfont { cairo_font_matrix aload pop pop pop 0 0 6 array astore
cairo_font exch selectfont cairo_point_x cairo_point_y moveto } bind def
/Tf { pop /cairo_font exch def /cairo_font_matrix where
{ pop cairo_selectfont } if } bind def
/Td { matrix translate cairo_font_matrix matrix concatmatrix dup
/cairo_font_matrix exch def dup 4 get exch 5 get cairo_store_point
/cairo_font where { pop cairo_selectfont } if } bind def
/Tm { 2 copy 8 2 roll 6 array astore /cairo_font_matrix exch def
cairo_store_point /cairo_font where { pop cairo_selectfont } if } bind def
/g { setgray } bind def
/rg { setrgbcolor } bind def
/d1 { setcachedevice } bind def
%%EndProlog
%%Page: 1 1
%%BeginPageSetup
%%PageMedia: 173x123mm
%%PageBoundingBox: 0 0 490 349
%%EndPageSetup
q 0 0 490 349 rectclip q
1 g
8 w
0 J
1 j
[] 0.0 d
4 M q 1 0 0 -1 0 348.156311 cm
4.016 344.156 m 117.402 174.078 l 4.016 4 l 89.055 4 l 202.441 174.078
l 89.055 344.156 l h
4.016 344.156 m S Q
q 1 0 0 -1 0 348.156311 cm
117.402 344.156 m 230.789 174.078 l 117.402 4 l 202.441 4 l 429.211
344.156 l 344.172 344.156 l 273.309 237.859 l 202.441 344.156 l h
117.402 344.156 m S Q
q 1 0 0 -1 0 348.156311 cm
391.418 244.945 m 353.625 188.254 l 485.906 188.25 l 485.906 244.945 l
h
391.418 244.945 m S Q
q 1 0 0 -1 0 348.156311 cm
334.727 159.906 m 296.93 103.215 l 485.906 103.211 l 485.906 159.906 l
h
334.727 159.906 m S Q
Q Q
showpage
%%Trailer
%%EOF
img/lambda.png (binary, 4.5 KiB → 20 KiB)
@@ -1,4 +1,5 @@
digraph G {
  graph [ dpi = 300 ];
  rankdir=LR
  node [shape=box]
  Source -> Parsing -> Desugar -> "Type Checking" -> Transformation -> Compilation

(rendered image: 6.6 KiB → 24 KiB)
@@ -1,4 +1,5 @@
digraph G {
  graph [ dpi = 300 ];
  rankdir=LR
  node [shape=box]
  Frontend -> "Core Language" -> "Compiler IR" -> "Machine Code"

(rendered image: 5.2 KiB → 19 KiB)
@@ -1,4 +1,5 @@
digraph G {
  graph [ dpi = 300 ];
  rankdir=LR
  node [shape=box]
  Source -> Parsing -> Desugar -> Inference -> Transformation -> Compilation
@@ -1,4 +1,5 @@
digraph G {
  graph [ dpi = 300 ];
  rankdir=LR
  node [shape=box]
  Parse -> Rename -> Typecheck -> Desugar -> ToCore -> Evaluate

(rendered image: 6.3 KiB → 23 KiB)
@@ -1,4 +1,5 @@
digraph G {
  graph [ dpi = 300 ];
  rankdir=LR
  node [shape=box]
  Source -> Frontend -> Core -> PHOAS

(rendered image: 4.4 KiB → 15 KiB)
@@ -1,4 +1,5 @@
digraph G {
  graph [ dpi = 72 ];
  rankdir=TB
  node [shape=box]
  WriterT -> IO [label = " execWriterT"]
img/stack.png (binary, 8.8 KiB → 5.5 KiB)
img/titles/llvm.png (new binary file, 4.9 KiB)
img/titles/systemf.png (new binary file, 12 KiB)
(several other binary images regenerated at larger sizes)
template.latex (104 changed lines)
@@ -1,5 +1,16 @@
\documentclass[$if(fontsize)$$fontsize$,$endif$$if(lang)$$lang$,$endif$$if(papersize)$$papersize$,$endif$$for(classoption)$$classoption$$sep$,$endfor$]{$documentclass$}
\usepackage{geometry}
\usepackage{xcolor}
\usepackage{graphicx}
\usepackage[labelformat=empty]{caption}
\usepackage{afterpage}

\newcommand\blankpage{%
  \null
  \thispagestyle{empty}%
  \addtocounter{page}{-1}%
  \newpage}

$if(fontfamily)$
\usepackage{$fontfamily$}
$else$

@@ -34,11 +45,20 @@ $if(sansfont)$
\setsansfont{$sansfont$}
$endif$
$if(monofont)$
-\setmonofont[Mapping=tex-ansi]{$monofont$}
+%\setmonofont[Mapping=tex-ansi]{$monofont$}
+% custom override
$endif$

$if(mathfont)$
\setmathfont(Digits,Latin,Greek){$mathfont$}
$endif$

\usepackage{fontspec}
\setmainfont[Ligatures=Common,
  ItalicFont={Adobe Garamond Pro Italic}]
  {Adobe Garamond Pro}
\setmonofont[Ligatures=NoCommon]{Source Code Pro}

\fi
% use upquote if available, for straight quotes in verbatim environments
\IfFileExists{upquote.sty}{\usepackage{upquote}}{}

@@ -106,7 +126,6 @@ $if(graphics)$
\fi
}
\makeatother

-\setkeys{Gin}{width=\ScaleWidthIfNeeded,height=\ScaleHeightIfNeeded,keepaspectratio}%
$endif$
\ifxetex

@@ -160,11 +179,88 @@ $endfor$

\begin{document}
$if(title)$
% ----------
% Title page
% ----------

\begin{titlepage}

\definecolor{titlepagecolor}{cmyk}{1,.60,0,.40}
\definecolor{namecolor}{cmyk}{1,.50,0,.10}
\newgeometry{left=7.5cm} % defines the geometry for the title page
\pagecolor{titlepagecolor}
\begin{figure}
\centering
-\includegraphics[width=4cm]{img/Haskell-Logo.png}
+\includegraphics[height=3in,width=3in]{img/Haskell-Logo.ps}
\end{figure}
\maketitle

\color{white}
\makebox[0pt][l]{\rule{1.3\textwidth}{1pt}}
\par
\noindent
{\huge \textsf{Write You a Haskell}}
\par
\noindent
{\textit{\textsf{Building a modern functional compiler from first principles}}}
%\textbf{\textsf{Something}} \textcolor{namecolor}{\textsf{Else}}
\vfill
\noindent
%{\huge \textsf{Write You a Haskell}}
\vskip\baselineskip
\noindent
{\huge \textsf{Stephen Diehl}}
\\
\textsf{January 2015 (Draft)}

\end{titlepage}
\pagecolor{white}
\restoregeometry % restores the geometry

% Filler page
%\null
%\thispagestyle{empty}
%\addtocounter{page}{-1}
%\newpage

% Subtitle page
%\vfill{3in}
%\begin{centering}
%{\HUGE \textsf{Write You a Haskell}}
%\end{centering}

%\thispagestyle{empty}
%\addtocounter{page}{-1}
%\newpage

% Copyright page
\thispagestyle{empty}
\addtocounter{page}{-1}

\begin{minipage}[b]{0.9\textwidth}
\footnotesize\raggedright
\setlength{\parskip}{0.5\baselineskip}

\begin{textsf}
{\textbf{Write You a Haskell}}\\
by Stephen Diehl

\par
Copyright \copyright\ 2013-2015. \\
\href{http://www.stephendiehl.com}{www.stephendiehl.com}

\par
This written work is licensed under a Creative Commons
Attribution-NonCommercial-ShareAlike 4.0 International License. You may
reproduce and edit this work with attribution for all non-commercial purposes.
\par
The included source is released under the terms of the MIT License.
\par
Git commit: f09d210be253a05fc8ad0827cd72ffa32404e2ba
\end{textsf}
\end{minipage}
\vspace*{2\baselineskip}
\cleardoublepage

$endif$
$if(abstract)$
\begin{abstract}