diff --git a/000_introduction.md b/000_introduction.md index 5cff34a..e73aece 100644 --- a/000_introduction.md +++ b/000_introduction.md @@ -1,14 +1,19 @@ +
![](img/titles/introduction.png) +
-****** +

+

-> When the limestone of imperative programming is worn away, the granite of -> functional programming will be observed. + Introduction ============ @@ -151,42 +156,71 @@ refines the space of allowable behavior and degree of expressible programs for the language. Types are the world's most popular formal method for analyzing programs. -$$ -\begin{aligned} -1 &: \t{Nat} \\ -(\lambda x . x) &: \forall a. a \to a \\ -(\lambda x y . x) &: \forall a b. a \to b \to a \\ -\end{aligned} -$$ +In a language like Python all expressions have the same type at compile time, +and all syntactically valid programs can be evaluated. In the case where the +program is nonsensical the runtime will bubble up exceptions at runtime. The +Python interpreter makes no attempt to analyze the given program for soundness +at all before running it. -In more sophisticated languages types and terms will commingle either with -explicit annotations on binders, or even as first class values themselves. +```bash +>>> True & "false" +Traceback (most recent call last): + File "", line 1, in +TypeError: unsupported operand type(s) for &: 'bool' and 'str' +``` -$$ -\t{Pair} \ u \ v = \Lambda X . \lambda x^{U \rightarrow V \rightarrow X} . x u v -$$ +By comparison Haskell will do quite a bit of work to try to ensure that the +program is well-defined before running it. The language that we use to +predescribe and analyze static semantics of the program is that of *static +types*. -In all the languages which we will implement the types present during compilation are -*erased*. Although they are present in the evaluation semantics, the runtime -cannot dispatch on types of values at runtime. Types by definition only exist at -compile-time in the static semantics of the language. +```bash +Prelude> True && "false" + +:2:9: + Couldn't match expected type `Bool' with actual type `[Char]' + In the second argument of `(&&)', namely `"false"' + In the expression: True && "false" + In an equation for `it': it = True && "false" +``` + +Catching minor type mismatch errors is the simplest example of usage, although +they occur extremely frequently as we humans are quite fallible in our reasoning +about even the simplest of program constructions! Although this just the tip of +the iceberg, the gradual trend over the last 20 years toward more *expressive +types* in modern type systems; which are capable of guaranteeing a large variety +of program correctness properties. + +* Preventing resource allocation errors. +* Enforcing security access for program logic. +* Side effect management. +* Preventing buffer overruns. +* Ensuring cryptographic properties for network protocols. +* Modeling and verify theorems in mathematics and logic. +* Preventing data races and deadlocks in concurrent systems. + +Type systems can never capture all possible behavior of the program. Although +more sophisticated type systems are increasingly able to model a large space of +behavior and is one of the most exciting areas of modern computer science +research. Put most bluntly, **static types let you be dumb** and offload the +checking that you would otherwise have to do in your head to a system that can +do the reasoning for you and work with you to interactively build your program. Functional Compilers -------------------- -A compiler is typically divided into parts, a *frontend* and a *backend*. These -are loose terms but the frontend typically deals with converting the human -representation of the code into some canonicalized form while the backend -converts the canonicalized form into another form that is suitable for -evaluation. +A *compiler* is a program for turning high-level representation of ideas in a +human readable language into another form. A compiler is typically divided into +parts, a *frontend* and a *backend*. These are loose terms but the frontend +typically deals with converting the human representation of the code into some +canonicalized form while the backend converts the canonicalized form into +another form that is suitable for evaluation. The high level structure of our functional compiler is described by the following *block diagram*. Each describes a *phase* which is a sequence of transformations composed to transform the input program. -

![](img/pipeline1.png) -

* **Source** - The frontend textual source language. * **Parsing** - Source is parsed into an abstract syntax tree. @@ -200,15 +234,13 @@ A *pass* may transform the input program from one form into another or alter the internal state of the compiler context. The high level description of the forms our final compiler will go through is the following sequence: -

![](img/pipeline2.png) -

Internal forms used during compilation are *intermediate representations* and typically any non-trivial language will involve several. -Lexing ------- +Parsing +------- The source code is simply the raw sequence of text that specifies the program. Lexing splits the text stream into a sequence of *tokens*. Only the presence of @@ -222,22 +254,22 @@ let f x = x + 1 For instance the previous program might generate a token stream like the following: -Token Value ------ ----- -reserved let -var f -var x -reservedOp = -var x -reservedOp + -integer 1 +```haskell +[ + TokenLet, + TokenSym "f", + TokenSym "x", + TokenEq, + TokenSym "x", + TokenAdd, + TokenNum 1 +] +``` -Parsing -------- - -A datatype for the *abstract syntax tree* (AST) is constructed by traversal of -the input stream and generation of the appropriate syntactic construct using a -parser. +We can then scan the token stream via and dispatch on predefined patterns of +tokens called *productions* and recursively build up a datatype for the +*abstract syntax tree* (AST) by traversal of the input stream and generation of +the appropriate syntactic. ```haskell type Name = String @@ -319,6 +351,19 @@ Let "f" [] (Lit (LitInt 1)))) ``` +Transformation +-------------- + +The type core representation is often suitable for evaluation, but quite often +different intermediate representations are more amenable to certain +optimizations and make explicit semantic properties of the language explicit. +These kind of intermediate forms will often attach information about free +variables, allocations, and usage information directly onto the AST to make it + +The most important form we will use is called the *Spineless Tagless G-Machine* +( STG ), an abstract machine that makes many of the properties of lazy +evaluation explicit directly in the AST. + Code Generation --------------- @@ -356,11 +401,11 @@ resulting module. ```perl f: - movl %edi, -4(%rsp) - movl -4(%rsp), %edi - addl $1, %edi - movl %edi, %eax - ret + movl %edi, -4(%rsp) + movl -4(%rsp), %edi + addl $1, %edi + movl %edi, %eax + ret ``` @@ -370,11 +415,11 @@ instructions defined by the processor specification. ```perl 0000000000000000 : - 0: 89 7c 24 fc mov %edi,-0x4(%rsp) - 4: 8b 7c 24 fc mov -0x4(%rsp),%edi - 8: 81 c7 01 00 00 00 add $0x1,%edi - e: 89 f8 mov %edi,%eax - 10: c3 retq + 0: 89 7c 24 fc mov %edi,-0x4(%rsp) + 4: 8b 7c 24 fc mov -0x4(%rsp),%edi + 8: 81 c7 01 00 00 00 add $0x1,%edi + e: 89 f8 mov %edi,%eax + 10: c3 retq ``` \pagebreak diff --git a/001_basics.md b/001_basics.md index 69cb384..47730e3 100644 --- a/001_basics.md +++ b/001_basics.md @@ -1,6 +1,6 @@ +
![](img/titles/basics.png) - -****** +

@@ -19,14 +18,15 @@ composition, function abstraction of a single variable. The **lambda calculus** consists very simply of three terms and all valid recursive combinations thereof: +![](img/lambda.png) + +The terms are named are typically referred to in code by the following +contractions. + - **Var** - A variable - **Lam** - A lambda abstraction - **App** - An application -

-![](img/lambda.png) -

- $$ \begin{aligned} e :=\ & x & \trule{Var} \\ @@ -183,15 +183,8 @@ choose the Haskell convention which denotes lambda by the backslash (``\``) to the body with (``->``), and application by spaces. Named variables are simply alphanumeric sequences of characters. -Logical notation: - -$\mathtt{const} = \lambda x y . x$ - -Haskell notation: - -```haskell -const = \x y -> x -``` +* **Logical notation**: $\mathtt{const} = \lambda x y . x$ +* **Haskell notation**: ``const = \x y -> x`` In addition other terms like literal numbers or booleans can be added, and these make writing expository examples a little easier. In addition we will add a diff --git a/004_type_systems.md b/004_type_systems.md index 28f0c9e..f55fdfe 100644 --- a/004_type_systems.md +++ b/004_type_systems.md @@ -1,12 +1,15 @@ +
![](img/titles/type_systems.png) +
-****** +

+

-> [A type system is a] tractable syntactic method for proving the +> *[A type system is a] tractable syntactic method for proving the > absence of certain program behaviors by classifying phrases -> according to the kinds of values they compute. +> according to the kinds of values they compute.* > -> - Benjamin Pierce +> — Benjamin Pierce

@@ -194,6 +197,11 @@ $$ \tau_1 \to \tau_2 \to \tau_3 \to \tau_4 \quad = \quad \tau_1 \to (\tau_2 \to (\tau_3 \to \tau_4)) $$ +In all the languages which we will implement the types present during +compilation are *erased*. Although types are possibly present in the evaluation +semantics, the runtime cannot dispatch on types of values at runtime. Types by +definition only exist at compile-time in the static semantics of the language. + Small-Step Semantics -------------------- @@ -603,18 +611,16 @@ the space of all programs and draw a large line around the universe of discourse programs that we are willing to consider, since these are the only programs that we can prove properties for. -> Well-typed programs don't go wrong, but not every program that never goes wrong +> *Well-typed programs don't go wrong, but not every program that never goes wrong > is well-typed. It's easy to exhibit programs that don't go wrong but are > ill-typed in ... any ... decidable type system. Many such programs are useful, > which is why dynamically-typed languages like Erlang and Lisp are justly -> popular. +> popular.* > -> -Simon Peyton Jones +> — Simon Peyton Jones Power always comes at a price. Using one system you can do more things. In diff --git a/005_evaluation.md b/005_evaluation.md index 273c4bc..1ce066d 100644 --- a/005_evaluation.md +++ b/005_evaluation.md @@ -1,10 +1,10 @@ +
![](img/titles/evaluation.png) +
-****** - -> Well-typed programs cannot "go wrong". +> *Well-typed programs cannot "go wrong".* > -> -Robin Milner +> — Robin Milner

@@ -22,8 +22,8 @@ lambda expression are necessarily evaluated before a lambda is reduced. A language in which the arguments are not necessarily evaluated before a lambda is reduced is non-strict. -Alternatively expressed, diverging terms are represented by the *bottom* value, -written as $\bot$. A function $f$ is non-strict if: +Alternatively expressed, diverging terms are represented up to equivalence by +the *bottom* value, written as $\bot$. A function $f$ is non-strict if: $$ f \bot \neq \bot @@ -81,9 +81,9 @@ how the subexpression ``(2 + 2)`` is evaluated to normal form before being bound. ```haskell -(λx. λy. y x) (2 + 2) λx. x + 1 -=> (λy. y 4) λx. x + 1 -=> (λy. x + 1) 4 +(\x. \y. y x) (2 + 2) λx. x + 1 +=> (\y. y 4) \x. x + 1 +=> (\y. x + 1) 4 => 4 + 1 => 5 ``` @@ -169,16 +169,15 @@ For example, the same expression we looked at for call-by-value has the same normal form but arrives at it by a different sequence of reductions: ```haskell -(λx. λy. y x) (2 + 2) λx. x + 1 -=> (λy.y (2 + 2)) λx. x + 1 -=> (λx.x + 1) (2 + 2) +(\x. \y. y x) (2 + 2) \x. x + 1 +=> (\y.y (2 + 2)) λx. x + 1 +=> (\x.x + 1) (2 + 2) => (2 + 2) + 1 => 4 + 1 => 5 ``` -Call-by-name is non-strict, although very few languages use this model, -[Frege](https://github.com/Frege/frege) being the most notable example. +Call-by-name is non-strict, although very few languages use this model. Call-by-need ------------ diff --git a/006_hindley_milner.md b/006_hindley_milner.md index c3f57d9..3a6eaa1 100644 --- a/006_hindley_milner.md +++ b/006_hindley_milner.md @@ -1,8 +1,11 @@ +
![](img/titles/hindley_milner.png) +
-****** +

+

-> There is nothing more practical than a good theory. +> *There is nothing more practical than a good theory.* > > — James C. Maxwell @@ -20,7 +23,7 @@ program give rise to a set of constraints that when solved always have a unique *principal type*. The simplest Hindley Milner type system is defined by a very short set of rules. -The first four rules describe the judgements by which we can each syntactic +The first four rules describe the judgements by which we can map each syntactic construct (``Lam``, ``App``, ``Var``, ``Let``) to their expected types. We'll elaborate on these rules shortly. @@ -100,6 +103,51 @@ let S f g x = f x (g x); As before ``let rec`` expressions will expand out in terms of the fixpoint operator and are just syntactic sugar. +Polymorphism +------------ + +We will add an additional constructs to our language that will admit a new form +of *polymorphism* for our language. Polymorphism is property of a term to +simultaneously admit several distinct types for the same function +implementation. + +For instance the polymorphic signature for the identity function maps a input of +type $\alpha$ + +$$ +\begin{aligned} +\mathtt{id}\ & ::\ \forall \alpha. \alpha \to \alpha \\ +\mathtt{id}\ & =\ \lambda x : \alpha.\ x +\end{aligned} +$$ + +Now instead of having to duplicate the functionality for every possible type +(i.e. implementing idInt, idBool, ...) we our type system admits any +instantiation that is subsumed by the polymorphic type signature. + +$$ +\begin{aligned} +& \t{id}_\t{Int} = \t{Int} \to \t{Int} \\ +& \t{id}_\t{Bool} = \t{Bool} \to \t{Bool} \\ +\end{aligned} +$$ + +A rather remarkably fact of universal quantification is that many properties +about inhabitants of a type are guaranteed by construction, these are the +so-called *free theorems*. For instance the only (nonpathological) +implementation that can inhabit a function of type ``(a, b) -> a`` is an +implementation precisely identical to that of ``fst``. + +A slightly less trivial example is that of the ``fmap`` function of type +``Functor f => (a -> b) -> f a -> f b``. The second functor law states that. + +```haskell +forall f g. fmap f . fmap g = fmap (f . g) +``` + +However it is impossible to write down a (nonpathological) function for ``fmap`` +that was well-typed and didn't have this property. We get the theorem for free! + Types ----- @@ -121,11 +169,10 @@ typeInt = TCon "Int" typeBool = TCon "Bool" ``` -However we will add an additional construct that will admit a new form of -*polymorphism* for our language. *Type schemes* model polymorphic types, they -indicate that the type variables bound in quantifier are polymorphic across the -enclosed type and can be instantiated with any type consistent with the -signature. +*Type schemes* model polymorphic types, they indicate that the type variables +bound in quantifier are polymorphic across the enclosed type and can be +instantiated with any type consistent with the signature. Intuitively the +indicate that the implementation of the function ```haskell data Scheme = Forall [TVar] Type @@ -229,8 +276,8 @@ $$ \begin{aligned} \FTV{\alpha} &= \{ \alpha \} \\ \FTV{\tau_1 \rightarrow \tau_2} &= \FTV{\tau_1} \cup \FTV{\tau_2} \\ -\FTV{\t{Int}} &= \emptyset \\ -\FTV{\t{Bool}} &= \emptyset \\ +\FTV{\t{Int}} &= \varnothing \\ +\FTV{\t{Bool}} &= \varnothing \\ \FTV{\forall x. t} &= \FTV{t} - \{ x \} \\ \end{aligned} $$ @@ -342,9 +389,9 @@ s := [n_0 / m_0, n_1 / m_1, ..., n_k / m_k] \\ $$ Two terms are said to be *unifiable* if there exists a unifying substitution set -between them. A substitution set is said to be **confluent** if the application -of substitutions is independent of the order applied, i.e. if we always arrive -at the same normal form regardless of the order of substitution chosen. +between them. A substitution set is said to be *confluent* if the application of +substitutions is independent of the order applied, i.e. if we always arrive at +the same normal form regardless of the order of substitution chosen. The notation we'll adopt for unification is, read as two types $\tau, \tau'$ are unifiable by a substitution $s$. @@ -1206,7 +1253,7 @@ Finally tab completion for our shell will use the interpreter's typing environment keys to complete on the set of locally defined variables. Repline supports prefix based tab completion where the prefix of the current command will be used to determine what to tab complete. In the case where we start with -the command ":load" we will instead tab complete on filenames in the current +the command ``:load`` we will instead tab complete on filenames in the current working directly instead. ```haskell diff --git a/007_path.md b/007_path.md index 3972825..c7734b9 100644 --- a/007_path.md +++ b/007_path.md @@ -1,6 +1,6 @@ +
![](img/titles/protohaskell.png) - -****** +
+ +

+

+ + +LLVM +==== + +LLVM is a statically typed intermediate representation and an associated +toolchain for manipulating, optimizing and converting this intermediate form +into native code. + +So for example consider a simple function which takes two arguments, adds them, +and xors the result. Writing in IR it would be formed as such: + +~~~~ {.haskell include="chapter27/example.ll"} +~~~~ + +Running this through the LLVM toolchain we can target our high level IR into +multiple different assembly codes mapping onto various architectures and CPUs +all from the same platform agnostic intermediate representation. + +**x86-64** + +```perl +test1: + .cfi_startproc + andl %edx, %esi + andl %edx, %edi + xorl %esi, %edi + movl %edi, %eax + ret +``` + +**ARM** + +```perl +test1: + and r1, r2, r1 + and r0, r2, r0 + eor r0, r0, r1 + mov pc, lr +``` + +**PowerPC** + +```perl +.L.test1: + .cfi_startproc + and 4, 5, 4 + and 3, 5, 3 + xor 3, 3, 4 + blr + .long 0 + .quad 0 +``` + +A uncommonly large amount of hardware manufacturers and software vendors (Adobe, +AMD, Apple, ARM, Google, IBM, Intel, Mozilla, Qualcomm, Samsung, Xilinx) have +come have converged on the LLVM toolchain as service agnostic way to talk about +generating machine code. + +What's even more impressive is that many of the advances in compiler +optimizations and static analysis have been mechanized in the form of +optimization passes so that all compilers written on top of the LLVM platform +can take advantage of the same advanced optimizers that would often previously +have to be developed independently. + +Types +----- + +**Primitive** + +```perl +i1 ; Boolean type +i8 ; char +i32 ; 32 bit integer +i64 ; 64 bit integer +float ; 32 bit +double ; 64 bit +``` + +**Arrays** + +```perl +[10 x float] ; Array of 10 floats +[10 x [20 x i32]] ; Array of 10 arrays of 20 integers. +``` + +**Structs** + +```perl +{float, i64} ; structure +{float, {double, i3}} ; nested structure +<{float, [2 x i3]}> ; packed structure +``` + +**Vectors** + +```perl +<4 x double> +<8 x float> +``` + +**Pointers** + +```perl +float* ; Pointer to a float +[25 x float]* ; Pointer to an array +``` + +The traditional ``void*`` pointer in C is a ``i8*`` pointer in LLVM with the +appropriate casts. + +**Constants** + +```perl +[i1 true, i1 false] ; constant bool array + ; constant vector +float 1.23421e+2 ; floating point constant +null ; null pointer constant +``` + +The ``zeroinitializer`` can be used to instantiate any type to the appropriate +zero of any type. + +```perl +<8 x float> zeroinitializer ; Zero vector +``` + +**Named Types** + +```perl +%vec4 = type <4 x i32> +%pair = type { i32, i32 } +``` + +Recursive types declarations are supported. + +```perl +%f = type { %f*, i32 } +``` + +**Platform Information** + + +```perl +target datalayout = " + e- + p : 64 : 64 : 64- + i1 : 8 : 8- + i8 : 8 : 8- + i16 : 16 : 16- + i32 : 32 : 32- + i64 : 64 : 64- + f32 : 32 : 32- + f64 : 64 : 64- + v64 : 64 : 64- + v128 : 128 : 128- + a0 : 0 : 64- + s0 : 64 : 64- + f80 : 128 : 128- + n8 : 16 : 32 : 64- + S128 + " +target triple = "x86_64-unknown-linux-gnu" +``` + +Specifications are delimited by the minus sign ``-``. + +* The ``e`` indicates the platform is little-endian. +* The ``i`` indicate the bitsize and alignment of the integer type. +* The ``f`` indicate the bitsize and alignment of the floating point type. +* The ``p`` indicate the bitsize and alignment of the pointer type. +* The ``v`` indicate the bitsize and alignment of the vector type. +* The ``a`` indicate the bitsize and alignment of the aggregate type. +* The ``n`` indicate the widths of the CPU registers. +* The ``S`` indicate the alignment of the stack. + +Variables +--------- + +Symbols used in an LLVM module are either global or local. Global symbols begin +with ``@`` and local symbols begin with ``%``. All symbols must be defined or +forward declared. + +Instructions in LLVM are either numbered sequentially (``%0``, ``%1``, ...) or +given explicit variable names (``%a``, ``%foo``, ..). For example the arguments +to the following function are named values, while the result of the add +instructions unnamed. + +```perl +define i32 @add(i32 %a, i32 %b) { + %1 = add i32 %a, %b + ret i32 %1 +} +``` + +Instructions +------------ + +```perl +%result = add i32 10, 20 +``` + +**Logical** + +* ``shl`` +* ``lshr`` +* ``ashr`` +* ``and`` +* ``or`` +* ``xor`` + +**Binary Operators** + +* ``add`` +* ``fadd`` +* ``sub`` +* ``fsub`` +* ``mul`` +* ``fmul`` +* ``udiv`` +* ``sdiv`` +* ``fdiv`` +* ``urem`` +* ``srem`` +* ``frem`` + +**Comparison** + +op unsigned signed floating +-- --------- -------- -------- +lt ULT SLT OLT +gt UGT SGT OGT +le ULE SLE OLE +ge UGE SGE OGE +eq EQ EQ OEQ +ne NE NE ONE + +```perl +%c = udiv i32 %a, %b +%d = sdiv i32 %a, %b +%e = fmul float %a, %b +%f = fdiv float %a, %b +``` + +```perl +%g = icmp eq i32 %a, %b +%i = icmp slt i32 %a, %b +%j = icmp ult i32 %a, %b +%k = fcmp olt float, %a, %b +``` + +Data +---- + +```perl +i1 1 +i32 299792458 +float 7.29735257e-3 +double 6.62606957e-34 +``` + +Blocks +------ + +Function definitions in LLVM introduce a sequence of labeled *basic blocks* +containing any number of instructions and a final *terminator* instruction which +indicates how control flow yields after the instructions of the basic block are +evaluated. + +```perl +define i1 @foo() { +entry: + br label %next +next: + br label %return +return: + ret i1 0 +} +``` + +A basic block has either zero (for entry block) or a fixed number of +*predecessors*. A graph with basic blocks as nodes and the predecessors of each +basic block as edges constitutes a *control flow graph*. LLVM's opt command can +be used to dump this graph using graphviz. + +```bash +$ opt -view-cfg module.ll +$ dot -Tpng module.dot -o module.png +``` + +We say a basic block A *dominates* a different block B in the control flow +graph if it's impossible to reach B without passing through "A, equivalently +A is the *dominator* of B. + +All logic in LLVM is written in *static single assignment* (SSA) form. Each +variable is assigned precisely once, and every variable is defined before it is +used. Updating any existing variable reference creates a new reference with for +the resulting output. + +Control Flow +------------ + +* Unconditional Branch +* Conditional Branch +* Switch +* Return +* Phi + +\clearpage + +**Return** + +The ``ret`` function simply exits the current function yielding the current +value to the virtual stack. + +```perl +define i1 @foo() { + ret i1 0 +} +``` + +![](chapter27/cfg/ret.png) + +\clearpage + +**Unconditional Branch** + +The unconditional branch ``br`` simply jumps to any basic block local to the +function. + +```perl +define i1 @foo() { + entry: + br label %next + next: + br label %return + return: + ret i1 0 +} +``` + +![](chapter27/cfg/branch.png) + +\clearpage + +**Conditional Branch** + +The conditional branch ``br`` jumps to one of two basic blocks based on whether +a test condition is ``true`` or ``false``. This corresponds the logic of a +traditional "if statement". + +```perl +define i32 @foo() { +start: + br i1 true, label %left, label %right +left: + ret i32 10 +right: + ret i32 20 +} +``` + +![](chapter27/cfg/cbranch.png) + +\clearpage + +**Switch** + +The switch statement ``switch`` jumps to any number of branches based on the +equality of value to a jump table matching values to basic blocks. + +```perl +define i32 @foo(i32 %a) { +entry: + switch i32 %a, label %default [ + i32 0, label %f + i32 1, label %g + i32 2, label %h + ] +f: + ret i32 1 +g: + ret i32 2 +h: + ret i32 3 +default: + ret i32 0 +} +``` + +![](chapter27/cfg/switch.png) + +\clearpage + +**Phi** + +A ``phi`` node selects a value based on the predecessor of the current block. + +```perl +define i32 @foo() { +start: + br i1 true, label %left, label %right +left: + %plusOne = add i32 0, 1 + br label %merge +right: + br label %merge +merge: + %join = phi i32 [ %plusOne, %left ], [ -1, %right ] + ret i32 %join +} +``` + +![](chapter27/cfg/phi.png) + +\clearpage + +**Loops** + +The traditional ``while`` and ``for`` loops can be written in terms of the +simpler conditional branching constructrs. For example in C we would write: + +```cpp +int count(int n) +{ + int i = 0; + while(i < n) + { + i++; + } + return i; +} +``` + +Whereas in LLVM we write: + +```perl +define i32 @count(i32 %n) { +entry: + br label %loop + +loop: + %i = phi i32 [ 1, %entry ], [ %nextvar, %loop ] + %nextvar = add i32 %i, 1 + + %cmptmp = icmp ult i32 %i, %n + %booltmp = zext i1 %cmptmp to i32 + %loopcond = icmp ne i32 %booltmp, 0 + + br i1 %loopcond, label %loop, label %afterloop + +afterloop: + ret i32 %i +} +``` + +![](chapter27/cfg/for.png) + +\clearpage + +**Select** + +Selects the first value if the test value is true, the second if false. + +```perl +%x = select i1 true, i8 10, i8 20 ; gives 10 +%y = select i1 false, i8 10, i8 20 ; gives 20 +``` + +Calls +----- + +* ``ccc``: The C calling convention +* ``fastcc``: The fast calling convention + +```perl +%result = call i32 @exp(i32 7) +``` + +Memory +------ + +LLVM uses the traditional load/store model: + +* ``load``: Load a typed value from a given reference +* ``store``: Store a typed value in a given reference +* ``alloca``: Allocate a pointer to memory on the virtual stack + +```perl +%ptr = alloca i32 +store i32 3, i32* %ptr +%val = load i32* %ptr +``` + +Specific pointer alignment can be specified: + +```perl +%ptr = alloca i32, align 1024 +``` + +For allocating in main memory we use an external reference to the C stdlib +memory allocator which gives us back a (``i8*``). + +```haskell +%ptr = call i8* @malloc(i32 %objectsize) +``` + +For structures: + +```perl +extractvalue {i32, float} %a, 0 ; gives i32 +extractvalue {i32, {float, double}} %a, 0, 1 ; gives double +extractvalue [2 x i32] %a, 0 ; yields i32 +``` + +```perl +%x = insertvalue {i32, float} %b, float %val, 1 ; gives {i32 1, float %b} +%y = insertvalue {i32, float} zeroinitializer, i32 1, 0 ; gives {i32 1, float 0} +``` + +GetElementPtr +------------- + +Casts +----- + +* ``trunc`` +* ``zext`` +* ``sext`` +* ``fptoui`` +* ``fptosi`` +* ``uitofp`` +* ``sitofp`` +* ``fptrunc`` +* ``fpext`` +* ``ptrtoint`` +* ``inttoptr`` +* ``bitcast`` + +```haskell +trunc i32 257 to i8 ; yields i8 1 +zext i32 257 to i64 ; yields i64 257 +sext i8 -1 to i16 ; yields i16 65535 +bitcast <2 x i32> %a to i64 ; yields i64 %a +``` + +Toolchain +--------- + +```bash +$ llc example.ll -o example.s # compile +$ lli example.ll # execute +$ opt -S example.bc -o example.ll # to assembly +$ opt example.ll -o example.bc # to bitcode +$ opt -O3 example.ll -o example.opt.ll -S # run optimizer +``` + +Individual modules can be linked together. + +```bash +$ llvm-link a.ll b.ll -o c.ll -S +``` + +Link time optimization. + +```bash +$ clang -O4 -emit-llvm a.c -c -o a.bc +$ clang -O4 -emit-llvm a.c -c -o a.bc +$ llvm-link a.bc b.bc -o all.bc +$ opt -std-compile-opts -std-link-opts -O3 all.bc -o optimized.bc +``` + +The clang project is a C compiler that targets LLVM as it's intermediate +representation. In the case where we'd like to know how some specific C +construct maps into LLVM IR we can ask clang to dump its internal IR using the +``-emit-llvm`` flag. + +```perl +# clang -emit-llvm -S add.c -o - +int add(int x) +{ + return x+1; +} +``` + +```perl +; ModuleID = 'add.c' +define i32 @add(i32 %x) nounwind uwtable { +entry: + %x.addr = alloca i32, align 4 + store i32 %x, i32* %x.addr, align 4 + %0 = load i32* %x.addr, align 4 + %add = add nsw i32 %0, 1 + ret i32 %add +} +``` + +LLVM is using a C++ API underneath the hood of all these tools. If you need to +work directly with the API it can be useful to be able to expand out the LLVM IR +into the equivalent C++ code. + +```bash +$ llc example.ll -march=cpp -o - +``` + +~~~~ {.haskell include="chapter27/example.ll"} +~~~~ + +```cpp +Function* func_test1 = mod->getFunction("test1"); +if (!func_test1) { +func_test1 = Function::Create( + /*Type=*/FuncTy_0, + /*Linkage=*/GlobalValue::ExternalLinkage, + /*Name=*/"test1", mod); +func_test1->setCallingConv(CallingConv::C); +} +AttrListPtr func_test1_PAL; +func_test1->setAttributes(func_test1_PAL); + +{ + Function::arg_iterator args = func_test1->arg_begin(); + Value* int32_x = args++; + int32_x->setName("x"); + Value* int32_y = args++; + int32_y->setName("y"); + Value* int32_z = args++; + int32_z->setName("z"); + + BasicBlock* label_1 = BasicBlock::Create(mod->getContext(), "",func_test1,0); + + BinaryOperator* int32_a = BinaryOperator::Create( + Instruction::And, int32_z, int32_x, "a", label_1); + BinaryOperator* int32_b = BinaryOperator::Create( + Instruction::And, int32_z, int32_y, "b", label_1); + BinaryOperator* int32_c = BinaryOperator::Create( + Instruction::Xor, int32_a, int32_b, "c", label_1); + ReturnInst::Create(mod->getContext(), int32_c, label_1); + +} +``` + +llvm-general +------------ + +The LLVM bindings for Haskell are split across two packages: + +* **llvm-general-pure** is a pure Haskell representation of the LLVM IR. + +* **llvm-general** is the FFI bindings to LLVM required for constructing the C representation of the + LLVM IR and performing optimization and compilation. + +llvm-general-pure does not require the LLVM libraries be available on the system. + +GHCi can have issues with the FFI and can lead to errors when working with +``llvm-general``. If you end up with errors like the following, then you are +likely trying to use ``GHCi`` or ``runhaskell`` and it is unable to link against +your LLVM library. Instead compile with standalone ``ghc``. + +```bash +Loading package llvm-general-3.3.8.2 +... linking +... ghc: /usr/lib/llvm-3.3/lib/libLLVMSupport.a: unknown symbol `_ZTVN4llvm14error_categoryE' +ghc: unable to load package `llvm-general-3.3.8.2' +```` + +Code Generation (LLVM) +====================== + +Resources +--------- + +* [LLVM Language Reference](http://llvm.org/docs/LangRef.html) +* [Implementing a JIT Compiled Language with Haskell and LLVM](http://www.stephendiehl.com/llvm/) + +\clearpage diff --git a/CONTRIBUTORS.md b/CONTRIBUTORS.md new file mode 100644 index 0000000..c8a16fe --- /dev/null +++ b/CONTRIBUTORS.md @@ -0,0 +1,14 @@ +Contributors +============ + +* Matthew Pickering +* Nick Sinopoli +* Nicolas Trangez +* Ingo Blechschmidt +* Rein Henrichs +* Ian Connolly +* Ben James +* Abe Voelker +* Paulo Tanimoto +* Brandon Williams +* Dmitry Ivanov diff --git a/Makefile b/Makefile index 395d10b..534a9ad 100644 --- a/Makefile +++ b/Makefile @@ -22,6 +22,7 @@ all: $(OBJ) top $(PANDOC) --filter ${FILTER} -f $(IFORMAT) $(FLAGS) -o $@ $< pdf: $(FILTER) + # $(PANDOC) --filter ${FILTER} -f $(IFORMAT) --template $(TEMPLATE_TEX) --latex-engine=xelatex $(FLAGS) -o WYAH.pdf title.md 0*.md contributing.md $(PANDOC) --filter ${FILTER} -f $(IFORMAT) --template $(TEMPLATE_TEX) --latex-engine=xelatex $(FLAGS) -o WYAH.pdf title.md 0*.md epub: $(FILTER) diff --git a/README.md b/README.md index 49dd7a3..c1df804 100644 --- a/README.md +++ b/README.md @@ -18,6 +18,7 @@ [![Build Status](https://travis-ci.org/sdiehl/write-you-a-haskell.svg)](https://travis-ci.org/sdiehl/write-you-a-haskell) [![Gitter](https://badges.gitter.im/Join%20Chat.svg)](https://gitter.im/sdiehl/write-you-a-haskell?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=body_badge) +[![MIT License](http://img.shields.io/badge/license-mit-blue.svg)](https://github.com/sdiehl/write-you-a-haskell/blob/master/LICENSE) Read Online: diff --git a/chapter10/.gitkeep b/chapter10/.gitkeep new file mode 100644 index 0000000..e69de29 diff --git a/chapter10/generics.hs b/chapter10/generics.hs new file mode 100644 index 0000000..3a84874 --- /dev/null +++ b/chapter10/generics.hs @@ -0,0 +1,36 @@ +{-# LANGUAGE TypeOperators #-} +{-# LANGUAGE FlexibleContexts #-} +{-# LANGUAGE DefaultSignatures #-} + +import GHC.Generics + +-- Auxiliary class +class GEq' f where + geq' :: f a -> f a -> Bool + +instance GEq' U1 where + geq' _ _ = True + +instance (GEq c) => GEq' (K1 i c) where + geq' (K1 a) (K1 b) = geq a b + +instance (GEq' a) => GEq' (M1 i c a) where + geq' (M1 a) (M1 b) = geq' a b + +instance (GEq' a, GEq' b) => GEq' (a :+: b) where + geq' (L1 a) (L1 b) = geq' a b + geq' (R1 a) (R1 b) = geq' a b + geq' _ _ = False + +instance (GEq' a, GEq' b) => GEq' (a :*: b) where + geq' (a1 :*: b1) (a2 :*: b2) = geq' a1 a2 && geq' b1 b2 + +-- +class GEq a where + geq :: a -> a -> Bool + default geq :: (Generic a, GEq' (Rep a)) => a -> a -> Bool + geq x y = geq' (from x) (from y) + +instance GEq Char where geq = (==) +instance GEq Int where geq = (==) +instance GEq Float where geq = (==) diff --git a/chapter11/.gitkeep b/chapter11/.gitkeep new file mode 100644 index 0000000..e69de29 diff --git a/chapter12/.gitkeep b/chapter12/.gitkeep new file mode 100644 index 0000000..e69de29 diff --git a/chapter12/systemf/Syntax.hs b/chapter12/systemf/Syntax.hs new file mode 100644 index 0000000..0636306 --- /dev/null +++ b/chapter12/systemf/Syntax.hs @@ -0,0 +1,24 @@ +type Name = String +type TypeVar = String +type TypeCon = String + +data Expr + = Lam Type Name Expr -- \x -> a + | Var Name -- x + | App Expr Expr -- a b + | TLam Name Expr -- /\ a . b + | TApp Expr Type -- a [ b ] + | Lit Literal -- 1 + | Let Name Expr Expr -- let x = v in a + +data Type + = TForall [Name] Type + | TArr Type Type + | TCon TypeCon + | TVar TypeVar + deriving (Show) + +data Literal + = LitInt Integer + | LitChar Char + deriving (Eq, Ord, Show) diff --git a/chapter27/cfg/branch.dot b/chapter27/cfg/branch.dot new file mode 100644 index 0000000..425c0e8 --- /dev/null +++ b/chapter27/cfg/branch.dot @@ -0,0 +1,9 @@ +digraph "CFG for 'foo' function" { + graph [ dpi = 72 ]; + label="Unconditional Branch"; + Node0x103c3c0 [shape=record,label="{start:\l br label %next\l}"]; + Node0x103c3c0 -> Node0x1038a90; + Node0x1038a90 [shape=record,label="{next: \l br label %return\l}"]; + Node0x1038a90 -> Node0x1038b90; + Node0x1038b90 [shape=record,label="{return: \l ret i1 false\l}"]; +} diff --git a/chapter27/cfg/branch.ll b/chapter27/cfg/branch.ll new file mode 100644 index 0000000..b881704 --- /dev/null +++ b/chapter27/cfg/branch.ll @@ -0,0 +1,9 @@ +define i1 @foo() { +start: + br label %next +next: + br label %return +return: + ret i1 0 +} + diff --git a/chapter27/cfg/branch.png b/chapter27/cfg/branch.png new file mode 100644 index 0000000..8fef29f Binary files /dev/null and b/chapter27/cfg/branch.png differ diff --git a/chapter27/cfg/cbranch.dot b/chapter27/cfg/cbranch.dot new file mode 100644 index 0000000..a386fe7 --- /dev/null +++ b/chapter27/cfg/cbranch.dot @@ -0,0 +1,10 @@ +digraph "CFG for 'foo' function" { + graph [ dpi = 72 ]; + label="Conditional Branch"; + + Node0x1f793c0 [shape=record,label="{start:\l br i1 true, label %left, label %right\l|{T|F}}"]; + Node0x1f793c0:s0 -> Node0x1f75a20; + Node0x1f793c0:s1 -> Node0x1f75b40; + Node0x1f75a20 [shape=record,label="{left: \l ret i32 10\l}"]; + Node0x1f75b40 [shape=record,label="{right: \l ret i32 20\l}"]; +} diff --git a/chapter27/cfg/cbranch.ll b/chapter27/cfg/cbranch.ll new file mode 100644 index 0000000..eb96bb3 --- /dev/null +++ b/chapter27/cfg/cbranch.ll @@ -0,0 +1,8 @@ +define i32 @foo() { +start: + br i1 true, label %left, label %right +left: + ret i32 10 +right: + ret i32 20 +} diff --git a/chapter27/cfg/cbranch.png b/chapter27/cfg/cbranch.png new file mode 100644 index 0000000..625b8a1 Binary files /dev/null and b/chapter27/cfg/cbranch.png differ diff --git a/chapter27/cfg/for.dot b/chapter27/cfg/for.dot new file mode 100644 index 0000000..123dcf6 --- /dev/null +++ b/chapter27/cfg/for.dot @@ -0,0 +1,11 @@ +digraph "CFG for 'printstar' function" { + graph [ dpi = 72 ]; + label="For loop"; + + Node0x11ea390 [shape=record,label="{entry:\l br label %loop\l}"]; + Node0x11ea390 -> Node0x11e72d0; + Node0x11e72d0 [shape=record,label="{loop: \l %i = phi i32 [ 1, %entry ], [ %nextvar, %loop ]\l %nextvar = add i32 %i, 1\l %cmptmp = icmp ult i32 %i, %n\l %booltmp = zext i1 %cmptmp to i32\l %loopcond = icmp ne i32 %booltmp, 0\l br i1 %loopcond, label %loop, label %afterloop\l|{T|F}}"]; + Node0x11e72d0:s0 -> Node0x11e72d0; + Node0x11e72d0:s1 -> Node0x11e7540; + Node0x11e7540 [shape=record,label="{afterloop: \l ret i32 %i\l}"]; +} diff --git a/chapter27/cfg/for.ll b/chapter27/cfg/for.ll new file mode 100644 index 0000000..f34f1a3 --- /dev/null +++ b/chapter27/cfg/for.ll @@ -0,0 +1,17 @@ +define i32 @printstar(i32 %n) { +entry: + br label %loop + +loop: + %i = phi i32 [ 1, %entry ], [ %nextvar, %loop ] + %nextvar = add i32 %i, 1 + + %cmptmp = icmp ult i32 %i, %n + %booltmp = zext i1 %cmptmp to i32 + %loopcond = icmp ne i32 %booltmp, 0 + + br i1 %loopcond, label %loop, label %afterloop + +afterloop: + ret i32 %i +} diff --git a/chapter27/cfg/for.png b/chapter27/cfg/for.png new file mode 100644 index 0000000..e24626a Binary files /dev/null and b/chapter27/cfg/for.png differ diff --git a/chapter27/cfg/phi.dot b/chapter27/cfg/phi.dot new file mode 100644 index 0000000..0fefad3 --- /dev/null +++ b/chapter27/cfg/phi.dot @@ -0,0 +1,14 @@ +digraph "CFG for 'foo' function" { + graph [ dpi = 72 ]; + + label="Phi node"; + + Node0x175c3c0 [shape=record,label="{start:\l br i1 true, label %left, label %right\l|{T|F}}"]; + Node0x175c3c0:s0 -> Node0x1758a20; + Node0x175c3c0:s1 -> Node0x1758b40; + Node0x1758a20 [shape=record,label="{left: \l %plusOne = add i32 0, 1\l br label %merge\l}"]; + Node0x1758a20 -> Node0x17593e0; + Node0x1758b40 [shape=record,label="{right: \l br label %merge\l}"]; + Node0x1758b40 -> Node0x17593e0; + Node0x17593e0 [shape=record,label="{merge: \l %join = phi i32 [ %plusOne, %left ], [ -1, %right ]\l ret i32 %join\l}"]; +} diff --git a/chapter27/cfg/phi.ll b/chapter27/cfg/phi.ll new file mode 100644 index 0000000..92666e6 --- /dev/null +++ b/chapter27/cfg/phi.ll @@ -0,0 +1,12 @@ +define i32 @foo() { +start: + br i1 true, label %left, label %right +left: + %plusOne = add i32 0, 1 + br label %merge +right: + br label %merge +merge: + %join = phi i32 [ %plusOne, %left], [ -1, %right] + ret i32 %join +} diff --git a/chapter27/cfg/phi.png b/chapter27/cfg/phi.png new file mode 100644 index 0000000..ee0b6a0 Binary files /dev/null and b/chapter27/cfg/phi.png differ diff --git a/chapter27/cfg/ret.dot b/chapter27/cfg/ret.dot new file mode 100644 index 0000000..8fdc2a5 --- /dev/null +++ b/chapter27/cfg/ret.dot @@ -0,0 +1,6 @@ +digraph "CFG for 'foo' function" { + graph [ dpi = 72 ]; + + label="Return"; + Node0x259a2b0 [shape=record,label="{%0:\l ret i1 false\l}"]; +} diff --git a/chapter27/cfg/ret.ll b/chapter27/cfg/ret.ll new file mode 100644 index 0000000..fcb04a8 --- /dev/null +++ b/chapter27/cfg/ret.ll @@ -0,0 +1,4 @@ +define i1 @foo() { + ret i1 0 +} + diff --git a/chapter27/cfg/ret.png b/chapter27/cfg/ret.png new file mode 100644 index 0000000..c29142c Binary files /dev/null and b/chapter27/cfg/ret.png differ diff --git a/chapter27/cfg/switch.dot b/chapter27/cfg/switch.dot new file mode 100644 index 0000000..72474bd --- /dev/null +++ b/chapter27/cfg/switch.dot @@ -0,0 +1,14 @@ +digraph "CFG for 'foo' function" { + graph [ dpi = 72 ]; + label="Switch"; + + Node0x1027390 [shape=record,label="{entry:\l switch i32 %a, label %default [\l i32 0, label %f\l i32 1, label %g\l i32 2, label %h\l ]\l|{def|0|1|2}}"]; + Node0x1027390:s0 -> Node0x1024700; + Node0x1027390:s1 -> Node0x1024300; + Node0x1027390:s2 -> Node0x1024440; + Node0x1027390:s3 -> Node0x1023920; + Node0x1024300 [shape=record,label="{f: \l ret i32 1\l}"]; + Node0x1024440 [shape=record,label="{g: \l ret i32 2\l}"]; + Node0x1023920 [shape=record,label="{h: \l ret i32 3\l}"]; + Node0x1024700 [shape=record,label="{default: \l ret i32 0\l}"]; +} diff --git a/chapter27/cfg/switch.ll b/chapter27/cfg/switch.ll new file mode 100644 index 0000000..2f91ab8 --- /dev/null +++ b/chapter27/cfg/switch.ll @@ -0,0 +1,14 @@ +define i32 @foo(i32 %a) { +entry: + switch i32 %a, label %default [ i32 0, label %f + i32 1, label %g + i32 2, label %h ] +f: + ret i32 1 +g: + ret i32 2 +h: + ret i32 3 +default: + ret i32 0 +} diff --git a/chapter27/cfg/switch.png b/chapter27/cfg/switch.png new file mode 100644 index 0000000..b53ed6c Binary files /dev/null and b/chapter27/cfg/switch.png differ diff --git a/chapter27/dsl/Codegen.hs b/chapter27/dsl/Codegen.hs new file mode 100644 index 0000000..3874c96 --- /dev/null +++ b/chapter27/dsl/Codegen.hs @@ -0,0 +1,281 @@ +{-# LANGUAGE OverloadedStrings #-} +{-# LANGUAGE GeneralizedNewtypeDeriving #-} + +module Codegen where + +import Data.Word +import Data.String +import Data.List +import Data.Function +import qualified Data.Map as Map + +import Control.Monad.State +import Control.Applicative + +import LLVM.General.AST +import LLVM.General.AST.Global +import qualified LLVM.General.AST as AST + +import qualified LLVM.General.AST.Constant as C +import qualified LLVM.General.AST.Attribute as A +import qualified LLVM.General.AST.CallingConvention as CC +import qualified LLVM.General.AST.FloatingPointPredicate as FP + +------------------------------------------------------------------------------- +-- Module Level +------------------------------------------------------------------------------- + +newtype LLVM a = LLVM { unLLVM :: State AST.Module a } + deriving (Functor, Applicative, Monad, MonadState AST.Module ) + +runLLVM :: AST.Module -> LLVM a -> AST.Module +runLLVM = flip (execState . unLLVM) + +emptyModule :: String -> AST.Module +emptyModule label = defaultModule { moduleName = label } + +addDefn :: Definition -> LLVM () +addDefn d = do + defs <- gets moduleDefinitions + modify $ \s -> s { moduleDefinitions = defs ++ [d] } + +define :: Type -> String -> [(Type, Name)] -> Codegen a -> LLVM () +define retty label argtys body = addDefn $ + GlobalDefinition $ functionDefaults { + name = Name label + , parameters = ([Parameter ty nm [] | (ty, nm) <- argtys], False) + , returnType = retty + , basicBlocks = bls + } + where + bls = createBlocks $ execCodegen $ do + enter <- addBlock entryBlockName + void $ setBlock enter + body + +external :: Type -> String -> [(Type, Name)] -> LLVM () +external retty label argtys = addDefn $ + GlobalDefinition $ functionDefaults { + name = Name label + , parameters = ([Parameter ty nm [] | (ty, nm) <- argtys], False) + , returnType = retty + , basicBlocks = [] + } + +--------------------------------------------------------------------------------- +-- Types +------------------------------------------------------------------------------- + +-- IEEE 754 double +double :: Type +double = FloatingPointType 64 IEEE + +------------------------------------------------------------------------------- +-- Names +------------------------------------------------------------------------------- + +type Names = Map.Map String Int + +uniqueName :: String -> Names -> (String, Names) +uniqueName nm ns = + case Map.lookup nm ns of + Nothing -> (nm, Map.insert nm 1 ns) + Just ix -> (nm ++ show ix, Map.insert nm (ix+1) ns) + +instance IsString Name where + fromString = Name . fromString + +------------------------------------------------------------------------------- +-- Codegen State +------------------------------------------------------------------------------- + +type SymbolTable = [(String, Operand)] + +data CodegenState + = CodegenState { + currentBlock :: Name -- Name of the active block to append to + , blocks :: Map.Map Name BlockState -- Blocks for function + , symtab :: SymbolTable -- Function scope symbol table + , blockCount :: Int -- Count of basic blocks + , count :: Word -- Count of unnamed instructions + , names :: Names -- Name Supply + } deriving Show + +data BlockState + = BlockState { + idx :: Int -- Block index + , stack :: [Named Instruction] -- Stack of instructions + , term :: Maybe (Named Terminator) -- Block terminator + } deriving Show + +------------------------------------------------------------------------------- +-- Codegen Operations +------------------------------------------------------------------------------- + +newtype Codegen a = Codegen { runCodegen :: State CodegenState a } + deriving (Functor, Applicative, Monad, MonadState CodegenState ) + +sortBlocks :: [(Name, BlockState)] -> [(Name, BlockState)] +sortBlocks = sortBy (compare `on` (idx . snd)) + +createBlocks :: CodegenState -> [BasicBlock] +createBlocks m = map makeBlock $ sortBlocks $ Map.toList (blocks m) + +makeBlock :: (Name, BlockState) -> BasicBlock +makeBlock (l, (BlockState _ s t)) = BasicBlock l s (maketerm t) + where + maketerm (Just x) = x + maketerm Nothing = error $ "Block has no terminator: " ++ (show l) + +entryBlockName :: String +entryBlockName = "entry" + +emptyBlock :: Int -> BlockState +emptyBlock i = BlockState i [] Nothing + +emptyCodegen :: CodegenState +emptyCodegen = CodegenState (Name entryBlockName) Map.empty [] 1 0 Map.empty + +execCodegen :: Codegen a -> CodegenState +execCodegen m = execState (runCodegen m) emptyCodegen + +fresh :: Codegen Word +fresh = do + i <- gets count + modify $ \s -> s { count = 1 + i } + return $ i + 1 + +instr :: Instruction -> Codegen (Operand) +instr ins = do + n <- fresh + let ref = (UnName n) + blk <- current + let i = stack blk + modifyBlock (blk { stack = i ++ [ref := ins] } ) + return $ local ref + +terminator :: Named Terminator -> Codegen (Named Terminator) +terminator trm = do + blk <- current + modifyBlock (blk { term = Just trm }) + return trm + +------------------------------------------------------------------------------- +-- Block Stack +------------------------------------------------------------------------------- + +entry :: Codegen Name +entry = gets currentBlock + +addBlock :: String -> Codegen Name +addBlock bname = do + bls <- gets blocks + ix <- gets blockCount + nms <- gets names + let new = emptyBlock ix + (qname, supply) = uniqueName bname nms + modify $ \s -> s { blocks = Map.insert (Name qname) new bls + , blockCount = ix + 1 + , names = supply + } + return (Name qname) + +setBlock :: Name -> Codegen Name +setBlock bname = do + modify $ \s -> s { currentBlock = bname } + return bname + +getBlock :: Codegen Name +getBlock = gets currentBlock + +modifyBlock :: BlockState -> Codegen () +modifyBlock new = do + active <- gets currentBlock + modify $ \s -> s { blocks = Map.insert active new (blocks s) } + +current :: Codegen BlockState +current = do + c <- gets currentBlock + blks <- gets blocks + case Map.lookup c blks of + Just x -> return x + Nothing -> error $ "No such block: " ++ show c + +------------------------------------------------------------------------------- +-- Symbol Table +------------------------------------------------------------------------------- + +assign :: String -> Operand -> Codegen () +assign var x = do + lcls <- gets symtab + modify $ \s -> s { symtab = [(var, x)] ++ lcls } + +getvar :: String -> Codegen Operand +getvar var = do + syms <- gets symtab + case lookup var syms of + Just x -> return x + Nothing -> error $ "Local variable not in scope: " ++ show var + +------------------------------------------------------------------------------- + +-- References +local :: Name -> Operand +local = LocalReference + +global :: Name -> C.Constant +global = C.GlobalReference + +externf :: Name -> Operand +externf = ConstantOperand . C.GlobalReference + +-- Arithmetic and Constants +fadd :: Operand -> Operand -> Codegen Operand +fadd a b = instr $ FAdd a b [] + +fsub :: Operand -> Operand -> Codegen Operand +fsub a b = instr $ FSub a b [] + +fmul :: Operand -> Operand -> Codegen Operand +fmul a b = instr $ FMul a b [] + +fdiv :: Operand -> Operand -> Codegen Operand +fdiv a b = instr $ FDiv a b [] + +fcmp :: FP.FloatingPointPredicate -> Operand -> Operand -> Codegen Operand +fcmp cond a b = instr $ FCmp cond a b [] + +cons :: C.Constant -> Operand +cons = ConstantOperand + +uitofp :: Type -> Operand -> Codegen Operand +uitofp ty a = instr $ UIToFP a ty [] + +toArgs :: [Operand] -> [(Operand, [A.ParameterAttribute])] +toArgs = map (\x -> (x, [])) + +-- Effects +call :: Operand -> [Operand] -> Codegen Operand +call fn args = instr $ Call False CC.C [] (Right fn) (toArgs args) [] [] + +alloca :: Type -> Codegen Operand +alloca ty = instr $ Alloca ty Nothing 0 [] + +store :: Operand -> Operand -> Codegen Operand +store ptr val = instr $ Store False ptr val Nothing 0 [] + +load :: Operand -> Codegen Operand +load ptr = instr $ Load False ptr Nothing 0 [] + +-- Control Flow +br :: Name -> Codegen (Named Terminator) +br val = terminator $ Do $ Br val [] + +cbr :: Operand -> Name -> Name -> Codegen (Named Terminator) +cbr cond tr fl = terminator $ Do $ CondBr cond tr fl [] + +phi :: Type -> [(Operand, Name)] -> Codegen Operand +phi ty incoming = instr $ Phi ty incoming [] + +ret :: Operand -> Codegen (Named Terminator) +ret val = terminator $ Do $ Ret (Just val) [] diff --git a/chapter27/dsl/JIT.hs b/chapter27/dsl/JIT.hs new file mode 100644 index 0000000..7b5a51a --- /dev/null +++ b/chapter27/dsl/JIT.hs @@ -0,0 +1,58 @@ +module JIT where + +import Data.Int +import Data.Word +import Foreign.Ptr ( FunPtr, castFunPtr ) + +import Control.Monad.Error + +import LLVM.General.Target +import LLVM.General.Context +import LLVM.General.CodeModel +import LLVM.General.Module as Mod +import qualified LLVM.General.AST as AST + +import LLVM.General.PassManager +import LLVM.General.Transforms +import LLVM.General.Analysis + +import qualified LLVM.General.ExecutionEngine as EE + +foreign import ccall "dynamic" haskFun :: FunPtr (IO Double) -> (IO Double) + +run :: FunPtr a -> IO Double +run fn = haskFun (castFunPtr fn :: FunPtr (IO Double)) + +jit :: Context -> (EE.MCJIT -> IO a) -> IO a +jit c = EE.withMCJIT c optlevel model ptrelim fastins + where + optlevel = Just 0 -- optimization level + model = Nothing -- code model ( Default ) + ptrelim = Nothing -- frame pointer elimination + fastins = Nothing -- fast instruction selection + +passes :: PassSetSpec +passes = defaultCuratedPassSetSpec { optLevel = Just 3 } + +runJIT :: AST.Module -> IO (Either String AST.Module) +runJIT mod = do + withContext $ \context -> + jit context $ \executionEngine -> + runErrorT $ withModuleFromAST context mod $ \m -> + withPassManager passes $ \pm -> do + -- Optimization Pass + {-runPassManager pm m-} + optmod <- moduleAST m + s <- moduleLLVMAssembly m + putStrLn s + + EE.withModuleInEngine executionEngine m $ \ee -> do + mainfn <- EE.getFunction ee (AST.Name "main") + case mainfn of + Just fn -> do + res <- run fn + putStrLn $ "Evaluated to: " ++ show res + Nothing -> return () + + -- Return the optimized module + return optmod diff --git a/chapter27/dsl/Main.hs b/chapter27/dsl/Main.hs new file mode 100644 index 0000000..81ad7d3 --- /dev/null +++ b/chapter27/dsl/Main.hs @@ -0,0 +1,32 @@ +import JIT +import Codegen +import qualified LLVM.General.AST as AST +import qualified LLVM.General.AST.Float as F +import qualified LLVM.General.AST.Constant as C + +{- + +; ModuleID = 'my cool jit' + +define double @main() { +entry: + %1 = fadd double 1.000000e+01, 2.000000e+01 + ret double %1 +} + +-} + +initModule :: AST.Module +initModule = emptyModule "my cool jit" + +logic = do + define double "main" [] $ do + let a = cons $ C.Float (F.Double 10) + let b = cons $ C.Float (F.Double 20) + res <- fadd a b + ret res + +main = do + let ast = runLLVM initModule logic + runJIT ast + return ast diff --git a/chapter27/dsl/dsl.cabal b/chapter27/dsl/dsl.cabal new file mode 100644 index 0000000..d6632e6 --- /dev/null +++ b/chapter27/dsl/dsl.cabal @@ -0,0 +1,22 @@ +name: dsl +version: 0.1.0.0 +--synopsis: +--description: +license: MIT +license-file: LICENSE-MIT +author: Stephen Diehl +maintainer: stephen.m.diehl@gmail.com +copyright: 2015 Stephen Diehl +Category: Compilers +build-type: Simple +cabal-version: >=1.10 + +library + default-language: Haskell2010 + build-depends: + base >= 4.6 + , haskeline >= 0.7.1.2 + , llvm-general == 3.4.4.* + , llvm-general-pure == 3.4.4.* + , mtl >= 2.2 + , transformers diff --git a/chapter27/example.cpp b/chapter27/example.cpp new file mode 100644 index 0000000..cb2dd08 --- /dev/null +++ b/chapter27/example.cpp @@ -0,0 +1,93 @@ +// Generated by llvm2cpp - DO NOT MODIFY! + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +using namespace llvm; + +Module* makeLLVMModule(); + +int main(int argc, char**argv) { + Module* Mod = makeLLVMModule(); + verifyModule(*Mod, PrintMessageAction); + PassManager PM; + PM.add(createPrintModulePass(&outs())); + PM.run(*Mod); + return 0; +} + + +Module* makeLLVMModule() { + // Module Construction + Module* mod = new Module("example.ll", getGlobalContext()); + + // Type Definitions + std::vectorFuncTy_0_args; + FuncTy_0_args.push_back(IntegerType::get(mod->getContext(), 32)); + FuncTy_0_args.push_back(IntegerType::get(mod->getContext(), 32)); + FuncTy_0_args.push_back(IntegerType::get(mod->getContext(), 32)); + FunctionType* FuncTy_0 = FunctionType::get( + /*Result=*/IntegerType::get(mod->getContext(), 32), + /*Params=*/FuncTy_0_args, + /*isVarArg=*/false); + + + // Function Declarations + + Function* func_test1 = mod->getFunction("test1"); + if (!func_test1) { + func_test1 = Function::Create( + /*Type=*/FuncTy_0, + /*Linkage=*/GlobalValue::ExternalLinkage, + /*Name=*/"test1", mod); + func_test1->setCallingConv(CallingConv::C); + } + AttrListPtr func_test1_PAL; + func_test1->setAttributes(func_test1_PAL); + + // Global Variable Declarations + + + // Constant Definitions + + // Global Variable Definitions + + // Function Definitions + + // Function: test1 (func_test1) + { + Function::arg_iterator args = func_test1->arg_begin(); + Value* int32_x = args++; + int32_x->setName("x"); + Value* int32_y = args++; + int32_y->setName("y"); + Value* int32_z = args++; + int32_z->setName("z"); + + BasicBlock* label_1 = BasicBlock::Create(mod->getContext(), "",func_test1,0); + + // Block (label_1) + BinaryOperator* int32_a = BinaryOperator::Create(Instruction::And, int32_z, int32_x, "a", label_1); + BinaryOperator* int32_b = BinaryOperator::Create(Instruction::And, int32_z, int32_y, "b", label_1); + BinaryOperator* int32_c = BinaryOperator::Create(Instruction::Xor, int32_a, int32_b, "c", label_1); + ReturnInst::Create(mod->getContext(), int32_c, label_1); + + } + + return mod; +} diff --git a/chapter27/example.ll b/chapter27/example.ll new file mode 100644 index 0000000..4290f67 --- /dev/null +++ b/chapter27/example.ll @@ -0,0 +1,6 @@ +define i32 @test1(i32 %x, i32 %y, i32 %z) { + %a = and i32 %z, %x + %b = and i32 %z, %y + %c = xor i32 %a, %b + ret i32 %c +} diff --git a/chapter5/stlc/Main.hs b/chapter5/stlc/Main.hs index 7af67ae..fa2e0e6 100644 --- a/chapter5/stlc/Main.hs +++ b/chapter5/stlc/Main.hs @@ -22,7 +22,7 @@ main :: IO () main = runInputT defaultSettings loop where loop = do - minput <- getInputLine "Stlc> " + minput <- getInputLine "Happy> " case minput of Nothing -> outputStrLn "Goodbye." Just input -> (liftIO $ process input) >> loop diff --git a/chapter9/assign/.gitignore b/chapter9/assign/.gitignore new file mode 100644 index 0000000..1b8952b --- /dev/null +++ b/chapter9/assign/.gitignore @@ -0,0 +1,3 @@ +Lexer.hs +Parser.hs +Main diff --git a/chapter9/assign/Eval.hs b/chapter9/assign/Eval.hs new file mode 100644 index 0000000..d33b233 --- /dev/null +++ b/chapter9/assign/Eval.hs @@ -0,0 +1,34 @@ +module Eval (eval) where + +import Syntax +import Control.Monad.State +import qualified Data.Map as Map + +data Value + = VInt Int + | VUnit + +instance Show Value where + show (VInt x) = show x + +type Eval = StateT Env IO +type Env = [(String, Value)] + +eval1 :: Expr -> Eval Value +eval1 expr = case expr of + Num a -> return (VInt a) + Var a -> do + env <- get + case lookup a env of + Just val -> return val + Nothing -> error "Not in scope" + Print a -> do + a' <- eval1 a + liftIO $ print a' + return VUnit + Assign ref val -> do + modify $ \s -> (ref, VInt val) : s + return VUnit + +eval :: [Expr] -> IO () +eval xs = evalStateT (mapM_ eval1 xs) [] diff --git a/chapter9/assign/Lexer.x b/chapter9/assign/Lexer.x new file mode 100644 index 0000000..0337f3e --- /dev/null +++ b/chapter9/assign/Lexer.x @@ -0,0 +1,38 @@ +{ +module Lexer ( + Token(..), + scanTokens +) where + +import Syntax +} + +%wrapper "basic" + +$digit = 0-9 +$alpha = [a-zA-Z] +$eol = [\n] + +tokens :- + + -- Whitespace insensitive + $eol ; + $white+ ; + print { \s -> TokenPrint } + $digit+ { \s -> TokenNum (read s) } + \= { \s -> TokenEq } + $alpha [$alpha $digit \_ \']* { \s -> TokenSym s } + +{ + +data Token + = TokenNum Int + | TokenSym String + | TokenPrint + | TokenEq + | TokenEOF + deriving (Eq,Show) + +scanTokens = alexScanTokens + +} diff --git a/chapter9/assign/Main.hs b/chapter9/assign/Main.hs new file mode 100644 index 0000000..a7a498a --- /dev/null +++ b/chapter9/assign/Main.hs @@ -0,0 +1,21 @@ +import Eval (eval) +import Parser (parseExpr) +import System.Environment + +process :: String -> IO () +process input = do + let ast = parseExpr input + case ast of + Right ast -> eval ast + Left err -> do + putStrLn "Parser Error:" + print err + +main :: IO () +main = do + args <- getArgs + case args of + [] -> putStrLn "Usage: assign " + [fname] -> do + contents <- readFile fname + process contents diff --git a/chapter9/assign/Makefile b/chapter9/assign/Makefile new file mode 100644 index 0000000..83ab200 --- /dev/null +++ b/chapter9/assign/Makefile @@ -0,0 +1,6 @@ +all: + alex Lexer.x + happy Parser.y + ghc --make Main -o Main +clean: + rm -f *.o *.hi Parser.hs Lexer.hs Main diff --git a/chapter9/assign/Parser.y b/chapter9/assign/Parser.y new file mode 100644 index 0000000..d39965c --- /dev/null +++ b/chapter9/assign/Parser.y @@ -0,0 +1,46 @@ +{ +{-# LANGUAGE GeneralizedNewtypeDeriving #-} + +module Parser ( + parseExpr, +) where + +import Lexer +import Syntax + +import Control.Monad.Except +} + +%name expr +%tokentype { Token } +%monad { Except String } { (>>=) } { return } +%error { parseError } + +%token + int { TokenNum $$ } + var { TokenSym $$ } + print { TokenPrint } + '=' { TokenEq } + +%% + +terms + : term { [$1] } + | term terms { $1 : $2 } + +term + : var { Var $1 } + | var '=' int { Assign $1 $3 } + | print term { Print $2 } + +{ + +parseError :: [Token] -> Except String a +parseError (l:ls) = throwError (show l) +parseError [] = throwError "Unexpected end of Input" + +parseExpr :: String -> Either String [Expr] +parseExpr input = + let tokenStream = scanTokens input in + runExcept (expr tokenStream) +} diff --git a/chapter9/assign/Syntax.hs b/chapter9/assign/Syntax.hs new file mode 100644 index 0000000..9881101 --- /dev/null +++ b/chapter9/assign/Syntax.hs @@ -0,0 +1,8 @@ +module Syntax where + +data Expr + = Var String + | Num Int + | Print Expr + | Assign String Int + deriving (Eq,Show) diff --git a/chapter9/assign/happy.cabal b/chapter9/assign/happy.cabal new file mode 100644 index 0000000..ed65d8d --- /dev/null +++ b/chapter9/assign/happy.cabal @@ -0,0 +1,19 @@ +name: assign +version: 0.1.0.0 +build-type: Simple +extra-source-files: README.md +cabal-version: >=1.10 + +executable assign + build-depends: + base >= 4.6 && <4.7 + , containers >= 0.5 && <0.6 + , mtl >= 2.2 + default-language: Haskell2010 + main-is: Main.hs + + Build-depends: base, array + build-tools: alex, happy + other-modules: + Parser, + Lexer diff --git a/chapter9/assign/input.test b/chapter9/assign/input.test new file mode 100644 index 0000000..2b1a3ca --- /dev/null +++ b/chapter9/assign/input.test @@ -0,0 +1,6 @@ +x = 4 +print x +y = 5 +print y +y = 6 +print y diff --git a/chapter9/happy/.gitignore b/chapter9/happy/.gitignore new file mode 100644 index 0000000..1b8952b --- /dev/null +++ b/chapter9/happy/.gitignore @@ -0,0 +1,3 @@ +Lexer.hs +Parser.hs +Main diff --git a/chapter9/happy/Eval.hs b/chapter9/happy/Eval.hs new file mode 100644 index 0000000..1cb4615 --- /dev/null +++ b/chapter9/happy/Eval.hs @@ -0,0 +1,52 @@ +module Eval where +import Syntax + +import Control.Monad.Identity +import qualified Data.Map as Map + +data Value + = VInt Integer + | VBool Bool + | VClosure String Expr (Eval.Scope) + +instance Show Value where + show (VInt x) = show x + show (VBool x) = show x + show VClosure{} = "<>" + +type Evaluate t = Identity t +type Scope = Map.Map String Value + +eval :: Eval.Scope -> Expr -> Identity Value +eval env expr = case expr of + Lit (LInt x) -> return $ VInt (fromIntegral x) + Lit (LBool x) -> return $ VBool x + Var x -> return $ env Map.! x + Lam x body -> return (VClosure x body env) + App a b -> do + x <- eval env a + y <- eval env b + apply x y + Op op a b -> do + x <- eval env a + y <- eval env b + return $ binop op x y + +binop :: Binop -> Value -> Value -> Value +binop Add (VInt a) (VInt b) = VInt (a+b) +binop Sub (VInt a) (VInt b) = VInt (a-b) +binop Mul (VInt a) (VInt b) = VInt (a*b) +binop Eql (VInt a) (VInt b) = VBool (a==b) + +extend :: Scope -> String -> Value -> Scope +extend env v t = Map.insert v t env + +apply :: Value -> Value -> Evaluate Value +apply (VClosure v t0 e) t1 = eval (extend e v t1) t0 +apply _ _ = error "Tried to apply closure" + +emptyScope :: Scope +emptyScope = Map.empty + +runEval :: Expr -> Value +runEval x = runIdentity (eval emptyScope x) diff --git a/chapter9/happy/Lexer.x b/chapter9/happy/Lexer.x new file mode 100644 index 0000000..8414bc3 --- /dev/null +++ b/chapter9/happy/Lexer.x @@ -0,0 +1,63 @@ +{ +module Lexer ( + Token(..), + scanTokens +) where + +import Syntax +} + +%wrapper "basic" + +$digit = 0-9 +$alpha = [a-zA-Z] +$eol = [\n] + +tokens :- + + -- Whitespace insensitive + $eol ; + $white+ ; + + -- Comments + "#".* ; + + -- Syntax + let { \s -> TokenLet } + True { \s -> TokenTrue } + False { \s -> TokenFalse } + in { \s -> TokenIn } + $digit+ { \s -> TokenNum (read s) } + "->" { \s -> TokenArrow } + \= { \s -> TokenEq } + \\ { \s -> TokenLambda } + [\+] { \s -> TokenAdd } + [\-] { \s -> TokenSub } + [\*] { \s -> TokenMul } + \( { \s -> TokenLParen } + \) { \s -> TokenRParen } + $alpha [$alpha $digit \_ \']* { \s -> TokenSym s } + +{ + +data Token + = TokenLet + | TokenTrue + | TokenFalse + | TokenIn + | TokenLambda + | TokenNum Int + | TokenSym String + | TokenArrow + | TokenEq + | TokenAdd + | TokenSub + | TokenMul + | TokenLParen + | TokenRParen + | TokenEOF + deriving (Eq,Show) + +scanTokens = alexScanTokens + +} diff --git a/chapter9/happy/Main.hs b/chapter9/happy/Main.hs new file mode 100644 index 0000000..d4834f3 --- /dev/null +++ b/chapter9/happy/Main.hs @@ -0,0 +1,25 @@ +import Eval +import Parser (parseExpr, parseTokens) + +import Control.Monad.Trans +import System.Console.Haskeline + +process :: String -> IO () +process input = do + let tokens = parseTokens input + print tokens + let ast = parseExpr input + case ast of + Left err -> do + putStrLn "Parser Error:" + print err + Right ast -> print $ runEval ast + +main :: IO () +main = runInputT defaultSettings loop + where + loop = do + minput <- getInputLine "Happy> " + case minput of + Nothing -> outputStrLn "Goodbye." + Just input -> (liftIO $ process input) >> loop diff --git a/chapter9/happy/Makefile b/chapter9/happy/Makefile new file mode 100644 index 0000000..83ab200 --- /dev/null +++ b/chapter9/happy/Makefile @@ -0,0 +1,6 @@ +all: + alex Lexer.x + happy Parser.y + ghc --make Main -o Main +clean: + rm -f *.o *.hi Parser.hs Lexer.hs Main diff --git a/chapter9/happy/Parser.y b/chapter9/happy/Parser.y new file mode 100644 index 0000000..da837ca --- /dev/null +++ b/chapter9/happy/Parser.y @@ -0,0 +1,83 @@ +{ +{-# LANGUAGE GeneralizedNewtypeDeriving #-} + +module Parser ( + parseExpr, + parseTokens, +) where + +import Lexer +import Syntax + +import Control.Monad.Except + +} + +-- Entry point +%name expr + +-- Entry point +%name expr + +-- Lexer structure +%tokentype { Token } + +-- Parser monad +%monad { Except String } { (>>=) } { return } +%error { parseError } + +-- Token Names +%token + let { TokenLet } + true { TokenTrue } + false { TokenFalse } + in { TokenIn } + NUM { TokenNum $$ } + VAR { TokenSym $$ } + '\\' { TokenLambda } + '->' { TokenArrow } + '=' { TokenEq } + '+' { TokenAdd } + '-' { TokenSub } + '*' { TokenMul } + '(' { TokenLParen } + ')' { TokenRParen } + +-- Operators +%left '+' '-' +%left '*' +%% + +Expr : let VAR '=' Expr in Expr { App (Lam $2 $6) $4 } + | '\\' VAR '->' Expr { Lam $2 $4 } + | Form { $1 } + +Form : Form '+' Form { Op Add $1 $3 } + | Form '-' Form { Op Sub $1 $3 } + | Form '*' Form { Op Mul $1 $3 } + | Fact { $1 } + +Fact : Fact Atom { App $1 $2 } + | Atom { $1 } + +Atom : '(' Expr ')' { $2 } + | NUM { Lit (LInt $1) } + | VAR { Var $1 } + | true { Lit (LBool True) } + | false { Lit (LBool True) } + +{ + +parseError :: [Token] -> Except String a +parseError (l:ls) = throwError (show l) +parseError [] = throwError "Unexpected end of Input" + +parseExpr :: String -> Either String Expr +parseExpr input = + let tokenStream = scanTokens input in + runExcept (expr tokenStream) + +parseTokens :: String -> [Token] +parseTokens = scanTokens + +} diff --git a/chapter9/happy/Syntax.hs b/chapter9/happy/Syntax.hs new file mode 100644 index 0000000..6316009 --- /dev/null +++ b/chapter9/happy/Syntax.hs @@ -0,0 +1,19 @@ +module Syntax where + +type Name = String + +data Expr + = Lam Name Expr + | App Expr Expr + | Var Name + | Lit Lit + | Op Binop Expr Expr + deriving (Eq,Show) + +data Lit + = LInt Int + | LBool Bool + deriving (Show, Eq, Ord) + +data Binop = Add | Sub | Mul | Eql + deriving (Eq, Ord, Show) diff --git a/chapter9/happy/happy.cabal b/chapter9/happy/happy.cabal new file mode 100644 index 0000000..d7072a1 --- /dev/null +++ b/chapter9/happy/happy.cabal @@ -0,0 +1,27 @@ +name: happyParser +version: 0.1.0.0 +license: MIT +license-file: LICENSE +author: Stephen Diehl +maintainer: stephen.m.diehl@gmail.com +build-type: Simple +extra-source-files: README.md +cabal-version: >=1.10 + +executable happyParser + build-depends: + base >= 4.6 && <4.7 + , pretty >= 1.1 && <1.2 + , parsec >= 3.1 && <3.2 + , containers >= 0.5 && <0.6 + , haskeline >= 0.7 + , mtl + , transformers + default-language: Haskell2010 + main-is: Main.hs + + Build-depends: base, array + build-tools: alex, happy + other-modules: + Parser, + Lexer diff --git a/chapter9/layout/Layout.hs b/chapter9/layout/Layout.hs new file mode 100644 index 0000000..18337f9 --- /dev/null +++ b/chapter9/layout/Layout.hs @@ -0,0 +1,65 @@ +{-# LANGUAGE FlexibleContexts #-} + +module Layout ( + -- * Layout combinators + IParsec, + laidout, + indented, + align, + runIndentParser, +) where + +import Data.Text.Lazy + +import Text.Parsec (ParseError) +import Text.Parsec.Pos +import Text.Parsec.Prim hiding (State) + +import Control.Monad.Identity +import Control.Applicative ((<$>)) + +-- Indentation sensitive Parsec monad. +type IParsec a = Parsec Text ParseState a + +data ParseState = ParseState + { indents :: Column + } deriving (Show) + +initParseState :: ParseState +initParseState = ParseState 0 + +indentCmp + :: (Column -> Column -> Bool) + -> IParsec () +indentCmp cmp = do + col <- sourceColumn <$> getPosition + current <- indents <$> getState + guard (col `cmp` current) + +withIndent :: Monad m =>Column-> Column -> ParsecT s ParseState m b -> ParsecT s ParseState m b +withIndent cur pos m = do + modifyState $ \st -> st { indents = pos } + res <- m + modifyState $ \st -> st { indents = cur } + return res + +laidout :: Parsec s ParseState a -> Parsec s ParseState a +laidout m = do + cur <- indents <$> getState + pos <- sourceColumn <$> getPosition + res <- withIndent cur pos m + return res + +indented :: IParsec () +indented = indentCmp (>) "Block (indented)" + +align :: IParsec () +align = indentCmp (==) "Block (same indentation)" + +runIndentParser + :: Stream Text Identity a + => SourceName + -> IParsec a + -> Text + -> Either ParseError a +runIndentParser filePath p = runParser p initParseState filePath diff --git a/chapter9/provenance/Eval.hs b/chapter9/provenance/Eval.hs new file mode 100644 index 0000000..2c9c66c --- /dev/null +++ b/chapter9/provenance/Eval.hs @@ -0,0 +1,42 @@ +module Eval where + +import Syntax + +import Control.Monad.Identity +import qualified Data.Map as Map + +data Value + = VInt Integer + | VBool Bool + | VClosure String Expr TermEnv + +type TermEnv = Map.Map String Value +type Interpreter t = Identity t + +emptyTmenv :: TermEnv +emptyTmenv = Map.empty + +instance Show Value where + show (VInt n) = show n + show (VBool n) = show n + show VClosure{} = "<>" + +eval :: TermEnv -> Expr -> Interpreter Value +eval env expr = case expr of + Var _ x -> do + let Just v = Map.lookup x env + return v + + Lam _ x body -> + return (VClosure x body env) + + App _ fun arg -> do + VClosure x body clo <- eval env fun + argv <- eval env arg + let nenv = Map.insert x argv clo + eval nenv body + +runEval :: TermEnv -> String -> Expr -> (Value, TermEnv) +runEval env nm ex = + let res = runIdentity (eval env ex) in + (res, Map.insert nm res env) diff --git a/chapter9/provenance/Infer.hs b/chapter9/provenance/Infer.hs new file mode 100644 index 0000000..cbe844c --- /dev/null +++ b/chapter9/provenance/Infer.hs @@ -0,0 +1,205 @@ +{-# LANGUAGE FlexibleInstances #-} +{-# LANGUAGE FlexibleContexts #-} +{-# LANGUAGE TypeSynonymInstances #-} +{-# LANGUAGE GeneralizedNewtypeDeriving #-} + +module Infer ( + inferTop, + TypeError(..), + Env, +) where + +import Control.Monad.Except +import Control.Monad.Reader +import Control.Monad.Writer +import Control.Monad.State + +import qualified Data.Map as Map +import qualified Data.Set as Set + +import Syntax +import Type + +setLoc :: Loc -> Type -> Type +setLoc l (TVar _ a) = TVar l a +setLoc l (TCon _ a) = TCon l a +setLoc l (TArr _ a b) = TArr l a b + +getLoc :: Type -> Loc +getLoc (TVar l _) = l +getLoc (TCon l _) = l +getLoc (TArr l _ _) = l + +------------------------------------------------------------------------------- +-- Substitution +------------------------------------------------------------------------------- + +type Unifier = (Subst, [Constraint]) +type Constraint = (Type, Type) +type Env = [(Name, Type)] + +extend :: (Name, Type) -> Env -> Env +extend xt env = xt : env + +newtype Subst = Subst (Map.Map TVar Type) + deriving (Eq, Ord, Show, Monoid) + +class Substitutable a where + apply :: Subst -> a -> a + ftv :: a -> Set.Set TVar + +instance Substitutable Type where + apply _ (TCon l a) = TCon l a + apply (Subst s) t@(TVar l a) = Map.findWithDefault t a s + apply s (TArr l t1 t2) = TArr l (apply s t1) (apply s t2) + + ftv TCon{} = Set.empty + ftv (TVar _ a) = Set.singleton a + ftv (TArr _ t1 t2) = ftv t1 `Set.union` ftv t2 + +instance Substitutable Constraint where + apply s (t1, t2) = (apply s t1, apply s t2) + ftv (t1, t2) = ftv t1 `Set.union` ftv t2 + +instance Substitutable a => Substitutable [a] where + apply = map . apply + ftv = foldr (Set.union . ftv) Set.empty + +data TypeError + = UnificationFail Type Loc Type Loc + | InfiniteType TVar Type + | UnboundVariable String + | Ambigious [Constraint] + | UnificationMismatch [Type] [Type] + +------------------------------------------------------------------------------- +-- Environment +------------------------------------------------------------------------------- + +inEnv :: (Name, Type) -> Check a -> Check a +inEnv (x,t) = local (extend (x,t)) + +lookupVar :: Name -> Check Type +lookupVar x = do + env <- ask + case lookup x env of + Nothing -> throwError $ UnboundVariable x + Just s -> return s + +letters :: [String] +letters = [1..] >>= flip replicateM ['a'..'z'] + +fresh :: Loc -> Check Type +fresh l = do + s <- get + put s{count = count s + 1} + return $ TVar l (TV (letters !! count s)) + +------------------------------------------------------------------------------- +-- Type Checker +------------------------------------------------------------------------------- + +-- | Inference state +data InferState = InferState { count :: Int } +type Check = + WriterT [Constraint] + (StateT InferState + (ExceptT TypeError (Reader Env))) + +check :: Expr -> Check Type +check expr = case expr of + Var l n -> do + t <- lookupVar n + return $ setLoc l t + + App l a b -> do + ta <- check a + tb <- check b + tr <- fresh l + unify ta (TArr l tb tr) + return tr + + Lam l n a -> do + tv <- fresh l + ty <- inEnv (n, tv) (check a) + return (TArr l ty (setLoc l tv)) + + Lit l _ -> return $ TCon l "Int" + +runCheck :: Env -> Check a -> Either TypeError (a, [Constraint]) +runCheck env = + flip runReader env + . runExceptT + . flip evalStateT (InferState 0) + . runWriterT + +inferTop :: Env -> Expr -> Either TypeError Type +inferTop env x = do + (ty, cs) <- runCheck env (check x) + s <- runSolve cs + return (apply s ty) + +------------------------------------------------------------------------------- +-- Constraint Solving +------------------------------------------------------------------------------- + +type Solve = StateT Unifier (Except TypeError) + +runSolve :: [Constraint] -> Either TypeError Subst +runSolve cs = runExcept (evalStateT solver st) + where st = (emptySubst, cs) + +-- Unification solver +solver :: Solve Subst +solver = do + (su, cs) <- get + case cs of + [] -> return su + ((t1, t2): cs0) -> do + (su1, cs1) <- unifies t1 t2 + put (su1 `compose` su, cs1 ++ (apply su1 cs0)) + solver + +-- | Empty unifier +emptyUnifer :: Unifier +emptyUnifer = (emptySubst, []) + +-- | The empty substitution +emptySubst :: Subst +emptySubst = mempty + +-- | Compose substitutions +compose :: Subst -> Subst -> Subst +(Subst s1) `compose` (Subst s2) = Subst $ Map.map (apply (Subst s1)) s2 `Map.union` s1 + +bind :: TVar -> Type -> Solve Unifier +bind a t + | eqLoc t a = return (emptySubst, []) + | occursCheck a t = throwError $ InfiniteType a t + | otherwise = return $ (Subst $ Map.singleton a t, []) + +eqLoc :: Type -> TVar -> Bool +eqLoc (TVar _ a) b = a == b +eqLoc _ _ = False + +occursCheck :: Substitutable a => TVar -> a -> Bool +occursCheck a t = a `Set.member` ftv t + +unifies :: Type -> Type -> Solve Unifier +unifies t1 t2 | t1 == t2 = return emptyUnifer +unifies (TVar _ v) t = v `bind` t +unifies t (TVar _ v) = v `bind` t +unifies (TArr _ t1 t2) (TArr _ t3 t4) = unifyMany [t1, t2] [t3, t4] +unifies t1 t2 = throwError $ UnificationFail t1 (getLoc t1) t2 (getLoc t2) + +unifyMany :: [Type] -> [Type] -> Solve Unifier +unifyMany [] [] = return emptyUnifer +unifyMany (t1 : ts1) (t2 : ts2) = + do (su1,cs1) <- unifies t1 t2 + (su2,cs2) <- unifyMany (apply su1 ts1) (apply su1 ts2) + return (su2 `compose` su1, cs1 ++ cs2) +unifyMany t1 t2 = throwError $ UnificationMismatch t1 t2 + +-- | Unify two types +unify :: Type -> Type -> Check () +unify t1 t2 = tell [(t1, t2)] diff --git a/chapter9/provenance/Lexer.hs b/chapter9/provenance/Lexer.hs new file mode 100644 index 0000000..41dfc8f --- /dev/null +++ b/chapter9/provenance/Lexer.hs @@ -0,0 +1,73 @@ +module Lexer where + +import Text.Parsec +import Text.Parsec.Text.Lazy +import qualified Data.Text.Lazy as L +import qualified Text.Parsec.Token as Tok +import qualified Text.Parsec.Expr as Ex + +import Data.Functor.Identity + +type Op a = Ex.Operator L.Text () Identity a +type Operators a = Ex.OperatorTable L.Text () Identity a + +reservedNames :: [String] +reservedNames = [ + "let", + "in", + "fix", + "rec", + "if", + "then", + "else" + ] + +reservedOps :: [String] +reservedOps = [ + "->", + "\\", + "+", + "*", + "-", + "=" + ] + +lexer :: Tok.GenTokenParser L.Text () Identity +lexer = Tok.makeTokenParser $ Tok.LanguageDef + { Tok.commentStart = "{-" + , Tok.commentEnd = "-}" + , Tok.commentLine = "--" + , Tok.nestedComments = True + , Tok.identStart = letter + , Tok.identLetter = alphaNum <|> oneOf "_'" + , Tok.opStart = oneOf ":!#$%&*+./<=>?@\\^|-~" + , Tok.opLetter = oneOf ":!#$%&*+./<=>?@\\^|-~" + , Tok.reservedNames = reservedNames + , Tok.reservedOpNames = reservedOps + , Tok.caseSensitive = True + } + +reserved :: String -> Parser () +reserved = Tok.reserved lexer + +reservedOp :: String -> Parser () +reservedOp = Tok.reservedOp lexer + +identifier :: Parser String +identifier = Tok.identifier lexer + +parens :: Parser a -> Parser a +parens = Tok.parens lexer + +semiSep :: Parser a -> Parser [a] +semiSep = Tok.semiSep lexer + +semi :: Parser String +semi = Tok.semi lexer + +contents :: Parser a -> Parser a +contents p = do + Tok.whiteSpace lexer + r <- p + eof + return r diff --git a/chapter9/provenance/Main.hs b/chapter9/provenance/Main.hs new file mode 100644 index 0000000..553a006 --- /dev/null +++ b/chapter9/provenance/Main.hs @@ -0,0 +1,149 @@ +{-# LANGUAGE FlexibleContexts #-} +{-# LANGUAGE FlexibleInstances #-} +{-# LANGUAGE OverloadedStrings #-} +{-# LANGUAGE TypeSynonymInstances #-} + +import Syntax +import Infer +import Parser +import Pretty + +import Data.Monoid +import qualified Data.Map as Map +import qualified Data.Text.Lazy as L +import qualified Data.Text.Lazy.IO as L + +import Control.Monad.Identity +import Control.Monad.State.Strict + +import Data.List (isPrefixOf) + +import System.Exit +import System.Environment +import System.Console.Repline + +------------------------------------------------------------------------------- +-- Types +------------------------------------------------------------------------------- + +data IState = IState + { tyctx :: Env -- Type environment + } + +initState :: IState +initState = IState [] + +type Repl a = HaskelineT (StateT IState IO) a + +hoistErr :: Show e => Either e a -> Repl a +hoistErr (Right val) = return val +hoistErr (Left err) = do + liftIO $ print err + abort + +------------------------------------------------------------------------------- +-- Execution +------------------------------------------------------------------------------- + +exec :: Bool -> L.Text -> Repl () +exec update source = do + -- Get the current interpreter state + st <- get + + -- Parser ( returns AST ) + mod <- hoistErr $ parseExpr source + + -- Type Inference ( returns Typing Environment ) + tyctx' <- hoistErr $ inferTop (tyctx st) mod + liftIO $ putStrLn (pptype tyctx') + return () + +showOutput :: String -> IState -> Repl () +showOutput arg st = do + case lookup "it" (tyctx st) of + Just val -> liftIO $ putStrLn $ ppsignature (arg, val) + Nothing -> return () + +cmd :: String -> Repl () +cmd source = exec True (L.pack source) + +------------------------------------------------------------------------------- +-- Commands +------------------------------------------------------------------------------- + +-- :browse command +browse :: [String] -> Repl () +browse _ = do + st <- get + undefined + {-liftIO $ mapM_ putStrLn $ ppenv (tyctx st)-} + +-- :load command +load :: [String] -> Repl () +load args = do + contents <- liftIO $ L.readFile (unwords args) + exec True contents + +-- :type command +typeof :: [String] -> Repl () +typeof args = do + st <- get + let arg = unwords args + case lookup arg (tyctx st) of + Just val -> liftIO $ putStrLn $ ppsignature (arg, val) + Nothing -> exec False (L.pack arg) + +-- :quit command +quit :: a -> Repl () +quit _ = liftIO $ exitSuccess + +------------------------------------------------------------------------------- +-- Interactive Shell +------------------------------------------------------------------------------- + +-- Prefix tab completer +defaultMatcher :: MonadIO m => [(String, CompletionFunc m)] +defaultMatcher = [ + (":load" , fileCompleter) + --, (":type" , values) + ] + +-- Default tab completer +comp :: (Monad m, MonadState IState m) => WordCompleter m +comp n = do + let cmds = [":load", ":type", ":browse", ":quit"] + ctx <- gets tyctx + let defs = fmap fst ctx + return $ filter (isPrefixOf n) (cmds ++ defs) + +options :: [(String, [String] -> Repl ())] +options = [ + ("load" , load) + , ("browse" , browse) + , ("quit" , quit) + , ("type" , Main.typeof) + ] + +------------------------------------------------------------------------------- +-- Entry Point +------------------------------------------------------------------------------- + +completer :: CompleterStyle (StateT IState IO) +completer = Prefix (wordCompleter comp) defaultMatcher + +shell :: Repl a -> IO () +shell pre = flip evalStateT initState + $ evalRepl "Poly> " cmd options completer pre + +------------------------------------------------------------------------------- +-- Toplevel +------------------------------------------------------------------------------- + +main :: IO () +main = do + args <- getArgs + case args of + [] -> shell (return ()) + [fname] -> shell (load [fname]) + ["test", fname] -> shell (load [fname] >> browse [] >> quit ()) + _ -> putStrLn "invalid arguments" diff --git a/chapter9/provenance/Parser.hs b/chapter9/provenance/Parser.hs new file mode 100644 index 0000000..c3ed8f0 --- /dev/null +++ b/chapter9/provenance/Parser.hs @@ -0,0 +1,77 @@ +{-# LANGUAGE OverloadedStrings #-} + +module Parser ( + parseExpr, + parseModule +) where + +import Text.Parsec +import Text.Parsec.Text.Lazy (Parser) + +import qualified Text.Parsec.Expr as Ex +import qualified Text.Parsec.Token as Tok + +import qualified Data.Text.Lazy as L + +import Lexer +import Syntax + +import Control.Applicative ((<$>)) + +integer :: Parser Integer +integer = Tok.integer lexer + +variable :: Parser Expr +variable = do + x <- identifier + l <- sourceLine <$> getPosition + return (Var (Located l) x) + +number :: Parser Expr +number = do + n <- integer + l <- sourceLine <$> getPosition + return (Lit (Located l) (fromIntegral n)) + +lambda :: Parser Expr +lambda = do + reservedOp "\\" + args <- many identifier + reservedOp "->" + body <- expr + l <- sourceLine <$> getPosition + return $ foldr (Lam (Located l)) body args + +aexp :: Parser Expr +aexp = parens expr + <|> lambda + <|> number + <|> variable + +expr :: Parser Expr +expr = do + es <- many1 aexp + l <- sourceLine <$> getPosition + return (foldl1 (App (Located l)) es) + +type Binding = (String, Expr) + +val :: Parser Binding +val = do + ex <- expr + return ("it", ex) + +top :: Parser Binding +top = do + x <- val + optional semi + return x + +modl :: Parser [Binding] +modl = many top + +parseExpr :: L.Text -> Either ParseError Expr +parseExpr input = parse (contents expr) "" input + +parseModule :: FilePath -> L.Text -> Either ParseError [(String, Expr)] +parseModule fname input = parse (contents modl) fname input diff --git a/chapter9/provenance/Pretty.hs b/chapter9/provenance/Pretty.hs new file mode 100644 index 0000000..9db1e3d --- /dev/null +++ b/chapter9/provenance/Pretty.hs @@ -0,0 +1,82 @@ +{-# Language FlexibleInstances #-} +{-# LANGUAGE OverloadedStrings #-} +{-# Language TypeSynonymInstances #-} + +module Pretty ( + ppdecl, + ppexpr, + ppsignature, + pptype +) where + +import Type +import Syntax +import Infer + +import Text.PrettyPrint + +parensIf :: Bool -> Doc -> Doc +parensIf True = parens +parensIf False = id + +class Pretty p where + ppr :: Int -> p -> Doc + pp :: p -> Doc + pp = ppr 0 + +instance Pretty Name where + ppr _ x = text x + +instance Pretty TVar where + ppr _ (TV x) = text x + +instance Pretty Type where + ppr p (TArr _ a b) = (parensIf (isArrow a) (ppr p a)) <+> text "->" <+> ppr p b + where + isArrow TArr{} = True + isArrow _ = False + ppr p (TVar _ a) = ppr p a + ppr _ (TCon _ a) = text a + +instance Pretty Expr where + ppr p (Var _ a) = ppr p a + ppr p (App _ a b) = parensIf (p > 0) $ ppr (p+1) a <+> ppr p b + ppr p (Lam _ a b) = text "\\" <> ppr p a <+> text "->" <+> ppr p b + ppr _ (Lit _ a) = int a + +instance Pretty Loc where + ppr p (NoLoc) = "" + ppr p (Located n) = int n + +instance Show TypeError where + show (UnificationFail a la b lb) = + concat [ + "Cannot unify types: \n\t" + , pptype a + , "\n\tIntroduced at: " + , (pploc la) + , "\nwith \n\t" + , pptype b + , "\n\tIntroduced at: " + , (pploc lb) + ] + show (InfiniteType (TV a) b) = + concat ["Cannot construct the the infinite type: ", a, " = ", pptype b] + show (Ambigious cs) = + concat ["Cannot not match expected type: '" ++ pptype a ++ "' with actual type: '" ++ pptype b ++ "'\n" | (a,b) <- cs] + show (UnboundVariable a) = "Not in scope: " ++ a + +pploc :: Loc -> String +pploc = render . ppr 0 + +pptype :: Type -> String +pptype = render . ppr 0 + +ppexpr :: Expr -> String +ppexpr = render . ppr 0 + +ppsignature :: (String, Type) -> String +ppsignature (a, b) = a ++ " : " ++ pptype b + +ppdecl :: (String, Expr) -> String +ppdecl (a, b) = "let " ++ a ++ " = " ++ ppexpr b diff --git a/chapter9/provenance/Syntax.hs b/chapter9/provenance/Syntax.hs new file mode 100644 index 0000000..360dfb6 --- /dev/null +++ b/chapter9/provenance/Syntax.hs @@ -0,0 +1,15 @@ +module Syntax ( + Expr(..), + Name, + Loc(..), +) where + +type Name = String +data Loc = NoLoc | Located Int + deriving (Show, Eq, Ord) + +data Expr + = Var Loc Name + | App Loc Expr Expr + | Lam Loc Name Expr + | Lit Loc Int diff --git a/chapter9/provenance/Type.hs b/chapter9/provenance/Type.hs new file mode 100644 index 0000000..bc723af --- /dev/null +++ b/chapter9/provenance/Type.hs @@ -0,0 +1,15 @@ +module Type ( + Type(..), + TVar(..), +) where + +import Syntax (Loc, Name) + +data Type + = TVar Loc TVar + | TCon Loc Name + | TArr Loc Type Type + deriving (Show, Eq, Ord) + +newtype TVar = TV String + deriving (Show, Eq, Ord) diff --git a/contributing.md b/contributing.md index c70523c..21491aa 100644 --- a/contributing.md +++ b/contributing.md @@ -1,3 +1,6 @@ +Contributing +============ + As always, I rely on the perpetual kindness and goodwill of Haskellers (like you!) to help correct grammar, clarify, and fix errors. @@ -11,6 +14,10 @@ repo and then submit a pull request on Github. There should be no need to compile the text locally. I will try to merge the changes quickly and rebuild the text daily. +If you would like to add your name to +[CONTRIBUTORS.md](https://github.com/sdiehl/write-you-a-haskell/blob/master/CONTRIBUTORS.md) +submit this along with your pull request. + *Complex Fixes* If you'd like to submit a change to the publishing software around the text, @@ -59,6 +66,28 @@ syntax highlighting. *Math Typesetting* +Equations can be included in display form: + +```latex +$$ +\int_\Omega \mathrm{d}\omega = \oint_{\partial \Omega} \omega +$$ +``` + +$$ +\int_\Omega \mathrm{d}\omega = \oint_{\partial \Omega} \omega +$$ + +Or in inline form (like $a^2 + b^2 = c^2$) with single dollar signs. Specially +there must be no spaces around the dollar signs otherwise Pandoc will not parse +it properly. + +```latex +$a^2 + b^2 = c^2$ +``` + +For most definitions, the ``aligned`` block is used: + ```latex $$ \begin{aligned} @@ -105,7 +134,7 @@ Typography LaTeX ----- -The $\latex$ styling is sourced from the ``template.latex`` file, which is an +The LaTeX styling is sourced from the ``template.latex`` file, which is an extension of Pandoc's default template with some custom modifications. Images diff --git a/css/style.css b/css/style.css index 2b3543e..cb218e3 100644 --- a/css/style.css +++ b/css/style.css @@ -23,6 +23,18 @@ pre code { font: 15px/19px Inconsolata, Monaco,"Lucida Console",Terminal,"Courier New",Courier; } +.figure { + text-align: center; +} + +.pagetitle .figure { + text-align: left !important; +} + +.pagetitle .figure img { + height: 36px; +} + table th { border-right: 1em solid transparent; } diff --git a/img/Haskell-Logo.ps b/img/Haskell-Logo.ps new file mode 100644 index 0000000..2faae74 --- /dev/null +++ b/img/Haskell-Logo.ps @@ -0,0 +1,96 @@ +%!PS-Adobe-3.0 +%%Creator: cairo 1.10.2 (http://cairographics.org) +%%CreationDate: Wed Jan 14 11:07:43 2015 +%%Pages: 1 +%%BoundingBox: 0 0 490 349 +%%DocumentData: Clean7Bit +%%LanguageLevel: 2 +%%DocumentMedia: 173x123mm 490 348 0 () () +%%EndComments +%%BeginProlog +/languagelevel where +{ pop languagelevel } { 1 } ifelse +2 lt { /Helvetica findfont 12 scalefont setfont 50 500 moveto + (This print job requires a PostScript Language Level 2 printer.) show + showpage quit } if +/q { gsave } bind def +/Q { grestore } bind def +/cm { 6 array astore concat } bind def +/w { setlinewidth } bind def +/J { setlinecap } bind def +/j { setlinejoin } bind def +/M { setmiterlimit } bind def +/d { setdash } bind def +/m { moveto } bind def +/l { lineto } bind def +/c { curveto } bind def +/h { closepath } bind def +/re { exch dup neg 3 1 roll 5 3 roll moveto 0 rlineto + 0 exch rlineto 0 rlineto closepath } bind def +/S { stroke } bind def +/f { fill } bind def +/f* { eofill } bind def +/n { newpath } bind def +/W { clip } bind def +/W* { eoclip } bind def +/BT { } bind def +/ET { } bind def +/pdfmark where { pop globaldict /?pdfmark /exec load put } + { globaldict begin /?pdfmark /pop load def /pdfmark + /cleartomark load def end } ifelse +/BDC { mark 3 1 roll /BDC pdfmark } bind def +/EMC { mark /EMC pdfmark } bind def +/cairo_store_point { /cairo_point_y exch def /cairo_point_x exch def } def +/Tj { show currentpoint cairo_store_point } bind def +/TJ { + { + dup + type /stringtype eq + { show } { -0.001 mul 0 cairo_font_matrix dtransform rmoveto } ifelse + } forall + currentpoint cairo_store_point +} bind def +/cairo_selectfont { cairo_font_matrix aload pop pop pop 0 0 6 array astore + cairo_font exch selectfont cairo_point_x cairo_point_y moveto } bind def +/Tf { pop /cairo_font exch def /cairo_font_matrix where + { pop cairo_selectfont } if } bind def +/Td { matrix translate cairo_font_matrix matrix concatmatrix dup + /cairo_font_matrix exch def dup 4 get exch 5 get cairo_store_point + /cairo_font where { pop cairo_selectfont } if } bind def +/Tm { 2 copy 8 2 roll 6 array astore /cairo_font_matrix exch def + cairo_store_point /cairo_font where { pop cairo_selectfont } if } bind def +/g { setgray } bind def +/rg { setrgbcolor } bind def +/d1 { setcachedevice } bind def +%%EndProlog +%%Page: 1 1 +%%BeginPageSetup +%%PageMedia: 173x123mm +%%PageBoundingBox: 0 0 490 349 +%%EndPageSetup +q 0 0 490 349 rectclip q +1 g +8 w +0 J +1 j +[] 0.0 d +4 M q 1 0 0 -1 0 348.156311 cm +4.016 344.156 m 117.402 174.078 l 4.016 4 l 89.055 4 l 202.441 174.078 +l 89.055 344.156 l h +4.016 344.156 m S Q +q 1 0 0 -1 0 348.156311 cm +117.402 344.156 m 230.789 174.078 l 117.402 4 l 202.441 4 l 429.211 +344.156 l 344.172 344.156 l 273.309 237.859 l 202.441 344.156 l h +117.402 344.156 m S Q +q 1 0 0 -1 0 348.156311 cm +391.418 244.945 m 353.625 188.254 l 485.906 188.25 l 485.906 244.945 l +h +391.418 244.945 m S Q +q 1 0 0 -1 0 348.156311 cm +334.727 159.906 m 296.93 103.215 l 485.906 103.211 l 485.906 159.906 l +h +334.727 159.906 m S Q +Q Q +showpage +%%Trailer +%%EOF diff --git a/img/lambda.png b/img/lambda.png index 2b1807a..5eb59b8 100644 Binary files a/img/lambda.png and b/img/lambda.png differ diff --git a/img/pipeline1.dot b/img/pipeline1.dot index e638ceb..2df9dcd 100644 --- a/img/pipeline1.dot +++ b/img/pipeline1.dot @@ -1,4 +1,5 @@ digraph G { + graph [ dpi = 300 ]; rankdir=LR node [shape=box] Source -> Parsing -> Desugar -> "Type Checking" -> Transformation -> Compilation diff --git a/img/pipeline1.png b/img/pipeline1.png index 9821090..0f6e4ef 100644 Binary files a/img/pipeline1.png and b/img/pipeline1.png differ diff --git a/img/pipeline2.dot b/img/pipeline2.dot index 7e35b49..13d70bc 100644 --- a/img/pipeline2.dot +++ b/img/pipeline2.dot @@ -1,4 +1,5 @@ digraph G { + graph [ dpi = 300 ]; rankdir=LR node [shape=box] Frontend -> "Core Language" -> "Compiler IR" -> "Machine Code" diff --git a/img/pipeline2.png b/img/pipeline2.png index ead3172..e26403f 100644 Binary files a/img/pipeline2.png and b/img/pipeline2.png differ diff --git a/img/proto.dot b/img/proto.dot index 80fb009..a079055 100644 --- a/img/proto.dot +++ b/img/proto.dot @@ -1,4 +1,5 @@ digraph G { + graph [ dpi = 300 ]; rankdir=LR node [shape=box] Source -> Parsing -> Desugar -> Inference -> Transformation -> Compliation diff --git a/img/proto_pass.dot b/img/proto_pass.dot index 962d4bb..cf9b57f 100644 --- a/img/proto_pass.dot +++ b/img/proto_pass.dot @@ -1,4 +1,5 @@ digraph G { + graph [ dpi = 300 ]; rankdir=LR node [shape=box] Parse -> Rename -> Typecheck -> Desugar -> ToCore -> Evaluate diff --git a/img/proto_pass.png b/img/proto_pass.png index 02b4d92..e12ad0a 100644 Binary files a/img/proto_pass.png and b/img/proto_pass.png differ diff --git a/img/protohaskell.dot b/img/protohaskell.dot index fbf79d3..2449d3e 100644 --- a/img/protohaskell.dot +++ b/img/protohaskell.dot @@ -1,4 +1,5 @@ digraph G { + graph [ dpi = 300 ]; rankdir=LR node [shape=box] Source -> Frontend -> Core -> PHOAS diff --git a/img/protohaskell.png b/img/protohaskell.png index 3ce1add..0258583 100644 Binary files a/img/protohaskell.png and b/img/protohaskell.png differ diff --git a/img/stack.dot b/img/stack.dot index 07cdc24..54a0e31 100644 --- a/img/stack.dot +++ b/img/stack.dot @@ -1,4 +1,5 @@ digraph G { + graph [ dpi = 72 ]; rankdir=TB node [shape=box] WriterT -> IO [label = " execWriterT"] diff --git a/img/stack.png b/img/stack.png index 44b6c82..e529da5 100644 Binary files a/img/stack.png and b/img/stack.png differ diff --git a/img/titles/basics.png b/img/titles/basics.png index b1d83b6..a7d89bb 100644 Binary files a/img/titles/basics.png and b/img/titles/basics.png differ diff --git a/img/titles/evaluation.png b/img/titles/evaluation.png index fb94252..58b1106 100644 Binary files a/img/titles/evaluation.png and b/img/titles/evaluation.png differ diff --git a/img/titles/extended_parser.png b/img/titles/extended_parser.png index 1f1ed0f..d6b4a3f 100644 Binary files a/img/titles/extended_parser.png and b/img/titles/extended_parser.png differ diff --git a/img/titles/hindley_milner.png b/img/titles/hindley_milner.png index 114d291..12b533a 100644 Binary files a/img/titles/hindley_milner.png and b/img/titles/hindley_milner.png differ diff --git a/img/titles/introduction.png b/img/titles/introduction.png index f2526f1..9408411 100644 Binary files a/img/titles/introduction.png and b/img/titles/introduction.png differ diff --git a/img/titles/lambda_calculus.png b/img/titles/lambda_calculus.png index ee7746f..fc05885 100644 Binary files a/img/titles/lambda_calculus.png and b/img/titles/lambda_calculus.png differ diff --git a/img/titles/llvm.png b/img/titles/llvm.png new file mode 100644 index 0000000..a5cbfe4 Binary files /dev/null and b/img/titles/llvm.png differ diff --git a/img/titles/parsing.png b/img/titles/parsing.png index 60389d2..ec2a2a7 100644 Binary files a/img/titles/parsing.png and b/img/titles/parsing.png differ diff --git a/img/titles/protohaskell.png b/img/titles/protohaskell.png index fc09dfc..477c21e 100644 Binary files a/img/titles/protohaskell.png and b/img/titles/protohaskell.png differ diff --git a/img/titles/systemf.png b/img/titles/systemf.png new file mode 100644 index 0000000..0623cff Binary files /dev/null and b/img/titles/systemf.png differ diff --git a/img/titles/type_systems.png b/img/titles/type_systems.png index 81df394..6125b99 100644 Binary files a/img/titles/type_systems.png and b/img/titles/type_systems.png differ diff --git a/template.latex b/template.latex index 96ec1d8..fee4eea 100644 --- a/template.latex +++ b/template.latex @@ -1,5 +1,16 @@ \documentclass[$if(fontsize)$$fontsize$,$endif$$if(lang)$$lang$,$endif$$if(papersize)$$papersize$,$endif$$for(classoption)$$classoption$$sep$,$endfor$]{$documentclass$} +\usepackage{geometry} +\usepackage{xcolor} +\usepackage{graphicx} \usepackage[labelformat=empty]{caption} +\usepackage{afterpage} + +\newcommand\blankpage{% + \null + \thispagestyle{empty}% + \addtocounter{page}{-1}% + \newpage} + $if(fontfamily)$ \usepackage{$fontfamily$} $else$ @@ -34,11 +45,20 @@ $if(sansfont)$ \setsansfont{$sansfont$} $endif$ $if(monofont)$ - \setmonofont[Mapping=tex-ansi]{$monofont$} + %\setmonofont[Mapping=tex-ansi]{$monofont$} + % custom override $endif$ + $if(mathfont)$ \setmathfont(Digits,Latin,Greek){$mathfont$} $endif$ + +\usepackage{fontspec} +\setmainfont[Ligatures=Common, + ItalicFont={Adobe Garamond Pro Italic}] + {Adobe Garamond Pro} +\setmonofont[Ligatures=NoCommon]{Source Code Pro} + \fi % use upquote if available, for straight quotes in verbatim environments \IfFileExists{upquote.sty}{\usepackage{upquote}}{} @@ -106,7 +126,6 @@ $if(graphics)$ \fi } \makeatother - \setkeys{Gin}{width=\ScaleWidthIfNeeded,height=\ScaleHeightIfNeeded,keepaspectratio}% $endif$ \ifxetex @@ -160,11 +179,88 @@ $endfor$ \begin{document} $if(title)$ +% ---------- +% Title page +% ---------- + +\begin{titlepage} + +\definecolor{titlepagecolor}{cmyk}{1,.60,0,.40} +\definecolor{namecolor}{cmyk}{1,.50,0,.10} +\newgeometry{left=7.5cm} %defines the geometry for the titlepage +\pagecolor{titlepagecolor} \begin{figure} \centering - \includegraphics[width=4cm]{img/Haskell-Logo.png} + \includegraphics[height=3in,width=3in]{img/Haskell-Logo.ps} \end{figure} -\maketitle + +\color{white} +\makebox[0pt][l]{\rule{1.3\textwidth}{1pt}} +\par +\noindent +{\huge \textsf{Write You a Haskell}} +\par +\noindent +{\textit{\textsf{Building a modern functional compiler from first principles}}} +%\textbf{\textsf{Something}} \textcolor{namecolor}{\textsf{Else}} +\vfill +\noindent +%{\huge \textsf{Write You a Haskell}} +\vskip\baselineskip +\noindent +{\huge \textsf{Stephen Diehl}} +\\ +\textsf{January 2015 (Draft)} + +\end{titlepage} +\pagecolor{white} +\restoregeometry % restores the geometry + +% Filler page +%\null +%\thispagestyle{empty} +%\addtocounter{page}{-1} +%\newpage + +% Subtitle page +%\vfill{3in} +%\begin{centering} +%{\HUGE \textsf{Write You a Haskell}} +%\end{centering} + +%\thispagestyle{empty} +%\addtocounter{page}{-1} +%\newpage + +% Copyright page +\thispagestyle{empty} +\addtocounter{page}{-1} + +\begin{minipage}[b]{0.9\textwidth} +\footnotesize\raggedright +\setlength{\parskip}{0.5\baselineskip} + +\begin{textsf} +{\textbf{Write You a Haskell}}\\ +by Stephen Diehl + +\par +Copyright \copyright\ 2013-2015. \\ +\href{http://www.stephendiehl.com}{www.stephendiehl.com} + +\par +This written work is licensed under a Creative Commons +Attribution-NonCommercial-ShareAlike 4.0 International License. You may +reproduce and edit this work with attribution for all non-commercial purposes. +\par +The included source is released under the terms of the MIT License. +\par +Git commit: f09d210be253a05fc8ad0827cd72ffa32404e2ba +\end{textsf} +\end{minipage} +\vspace*{2\baselineskip} +\cleardoublepage + $endif$ $if(abstract)$ \begin{abstract} diff --git a/title.md b/title.md index 7989766..ff99f31 100644 --- a/title.md +++ b/title.md @@ -1,3 +1,47 @@ % Write You a Haskell % Stephen Diehl % 1/2/2015 + +