diff --git a/003_lambda_calculus.md b/003_lambda_calculus.md
index 3158f3a..05a832f 100644
--- a/003_lambda_calculus.md
+++ b/003_lambda_calculus.md
@@ -43,7 +43,7 @@ $$
The lambda calculus is often called the "assembly language" of functional
programming, and variations and extensions on it form the basis of many
functional compiler intermediate forms for languages like Haskell, OCaml,
-StandardML, etc. The variation we will discuss first is known as **untyped
+Standard ML, etc. The variation we will discuss first is known as **untyped
lambda calculus**, by contrast later we will discuss the **typed lambda
calculus** which is an extension thereof.

@@ -62,9 +62,9 @@ x_1\ x_2\ x_3\ ... x_n = (... ((x_1\ x_2)\ x_3) ... x_n)
$$

By convention application extends as far to the right as is syntactically
-meaningful. Parenthesis are used to disambiguate.
+meaningful. Parentheses are used to disambiguate.

-In the lambda calculus all lambda abstractions bind a single variable, their
+In the lambda calculus, each lambda abstraction binds a single variable, and the
lambda abstraction's body may be another lambda abstraction. Out of convenience
we often write multiple lambda abstractions with their variables on one lambda
symbol. This is merely a syntactical convention and does not change the underlying

@@ -75,8 +75,8 @@ $$
$$

The actual implementation of the lambda calculus admits several degrees of
-freedom in how they are represented. The most notable is the choice of
-identifier for the binding variables. A variable is said to be *bound* if it is
+freedom in how lambda abstractions are represented. The most notable is the choice of
+identifiers for the binding variables. A variable is said to be *bound* if it is
contained in a lambda expression of the same variable binding. Conversely a
variable is *free* if it is not bound.

@@ -90,12 +90,12 @@ e_1 &= \lambda x. (x (\lambda y. y a) x) y \\
\end{aligned}
$$

-$e_0$ is a combinator while $e_1$ is not. In $e_1$ both occurances of $x$ are bound. The first $y$ is bound,
+$e_0$ is a combinator while $e_1$ is not. In $e_1$ both occurrences of $x$ are bound. The first $y$ is bound,
while the second is free. $a$ is also free.

Multiple lambda abstractions may bind the same variable name.
-Each occurance of a variable is then bound by the nearest enclosing binder.
-For example the $x$ variable in the following expression is
+Each occurrence of a variable is then bound by the nearest enclosing binder.
+For example, the $x$ variable in the following expression is
bound on the inner lambda, while $y$ is bound on the outer lambda. This
phenomenon is referred to as *name shadowing*.

@@ -124,7 +124,7 @@ k x y = x
i x = x
```

-Rather remarkably Moses Schönfinkel showed that all closed lambda expression can be expressed in terms of only the
+Rather remarkably, Moses Schönfinkel showed that all closed lambda expressions can be expressed in terms of only the
**S** and **K** combinators - even the **I** combinator. For example one can
easily show that **SKK** reduces to **I**.
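
+We can check this reduction concretely in Haskell. The following is a small
+sketch (the ``s`` and ``k`` definitions mirror the combinators above; the
+``skk`` name is ours): the type of ``s k k`` is forced to be ``a -> a``,
+precisely the type of **I**.
+
+```haskell
+-- S combinator: duplicates the argument x into both f and g.
+s :: (a -> b -> c) -> (a -> b) -> a -> c
+s f g x = f x (g x)
+
+-- K combinator: discards its second argument.
+k :: a -> b -> a
+k x y = x
+
+-- SKK behaves exactly like I:  s k k x  =  k x (k x)  =  x
+skk :: a -> a
+skk = s k k
+```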
diff --git a/006_hindley_milner.md b/006_hindley_milner.md
index 3a6eaa1..b4f548e 100644
--- a/006_hindley_milner.md
+++ b/006_hindley_milner.md
@@ -40,7 +40,7 @@ $$
Milner's observation was that since the typing rules map uniquely onto syntax,
we can in effect run the typing rules "backwards" and whenever we don't have a
-known type for an subexpression, we "guess" by putting a fresh variable in its
+known type for a subexpression, we "guess" by putting a fresh variable in its
place, collecting constraints about its usage induced by subsequent typing
judgements. This is the essence of *type inference* in the ML family of
languages, that by the generation and solving of a class of unification problems
@@ -107,11 +107,11 @@ Polymorphism
------------

We will add an additional construct to our language that will admit a new form
-of *polymorphism* for our language. Polymorphism is property of a term to
+of *polymorphism* for our language. Polymorphism is the property of a term to
simultaneously admit several distinct types for the same function
implementation.

-For instance the polymorphic signature for the identity function maps a input of
+For instance the polymorphic signature for the identity function maps an input of
type $\alpha$

$$
@@ -134,19 +134,18 @@ $$

A rather remarkable fact of universal quantification is that many properties
about inhabitants of a type are guaranteed by construction; these are the
-so-called *free theorems*. For instance the only (nonpathological)
-implementation that can inhabit a function of type ``(a, b) -> a`` is an
-implementation precisely identical to that of ``fst``.
+so-called *free theorems*. For instance any (nonpathological)
+inhabitant of the type ``(a, b) -> a`` must be equivalent to ``fst``.

A slightly less trivial example is that of the ``fmap`` function of type
-``Functor f => (a -> b) -> f a -> f b``. The second functor law states that.
+``Functor f => (a -> b) -> f a -> f b``. The second functor law demands that:

```haskell
forall f g. fmap f . fmap g = fmap (f . g)
```

However it is impossible to write down a (nonpathological) function for ``fmap``
-that was well-typed and didn't have this property. We get the theorem for free!
+that has the required type and doesn't have this property. We get the theorem for free!

Types
-----
@@ -198,7 +197,7 @@ $$
$$

We've now divided our types into two syntactic categories, the *monotypes* and
-*polytypes*. In our simple initial languages type schemes will always be the
+the *polytypes*. In our simple initial languages, type schemes will always be the
representation of top level signatures, even if there are no polymorphic type
variables. In implementation terms this means when a monotype is yielded from
our Infer monad after inference, we will immediately generalize it at the
@@ -209,7 +208,7 @@ Context

The typing context or environment is the central container around which all
information during the inference process is stored and queried. In Haskell our
-implementation will simply be a newtype wrapper around a Map` of ``Var`` to
+implementation will simply be a newtype wrapper around a ``Map`` of ``Var`` to
``Scheme`` types.

```haskell
@@ -238,7 +237,7 @@ extend (TypeEnv env) (x, s) = TypeEnv $ Map.insert x s env
Inference Monad
---------------

-All our logic for type inference will live inside of the ``Infer`` monad. Which
+All our logic for type inference will live inside the ``Infer`` monad.
It is a monad transformer stack of ``ExceptT`` + ``State``, allowing various
error reporting and statefully holding the fresh name supply.
@@ -270,7 +269,7 @@ $$
\end{aligned}
$$

-Likewise the same pattern applies for type variables at the type level.
+The same pattern applies to type variables at the type level.

$$
\begin{aligned}
@@ -291,7 +290,7 @@ $$
{[x / e'] x} &= e' \\
[x / e'] y &= y \quad (y \ne x) \\
[x / e'] (e_1 e_2) &= ([x / e'] \ e_1) ([x / e'] e_2) \\
-[x / e'] (\lambda y. e_1) &= \lambda y. [x / e']e \quad y \ne x, x \notin \FV{e'} \\
+[x / e'] (\lambda y. e_1) &= \lambda y. [x / e']e_1 \quad y \ne x, y \notin \FV{e'} \\
\end{aligned}
$$

@@ -324,7 +323,7 @@ s1 `compose` s2 = Map.map (apply s1) s2 `Map.union` s1
```

The implementation in Haskell is via a series of implementations of a
-``Substitutable`` typeclass which exposes the ``apply`` which applies the
+``Substitutable`` typeclass which exposes an ``apply`` function that applies the
substitution given over the structure of the type replacing type variables as
specified.

@@ -358,7 +357,7 @@ instance Substitutable TypeEnv where

Throughout both the typing rules and substitutions we will require a fresh
supply of names. In this naive version we will simply use an infinite list of
-strings and slice into n'th element of list per a index that we hold in a State
+strings and slice into the n'th element of the list per an index that we hold in a State
monad. This is the simplest implementation possible, and later we will adapt this
name generation technique to be more robust.

@@ -374,14 +373,14 @@ fresh = do
```

The creation of fresh variables will be essential for implementing the inference
-rules. Whenever we encounter the first use a variable within some expression we
-will create a fresh type variable.
+rules. Whenever we encounter the first use of a variable within some expression
+we will create a fresh type variable.

Unification
-----------

-Central to the idea of inference is the notion *unification*. A unifier is a
-function $s$ for two expressions $e_1$ and $e_2$ is a relation such that:
+Central to the idea of inference is the notion of *unification*. A unifier
+for two expressions $e_1$ and $e_2$ is a substitution $s$ such that:

$$
s := [n_0 / m_0, n_1 / m_1, ..., n_k / m_k] \\
@@ -393,14 +392,14 @@ between them. A substitution set is said to be *confluent* if the application
of substitutions is independent of the order applied, i.e. if we always arrive
at the same normal form regardless of the order of substitution chosen.

-The notation we'll adopt for unification is, read as two types $\tau, \tau'$ are
-unifiable by a substitution $s$.
+We'll adopt the notation

$$
\tau \sim \tau' : s
$$

-Such that:
+for the fact that two types $\tau, \tau'$ are unifiable by a substitution $s$,
+such that:

$$
[s] \tau = [s] \tau'
@@ -437,20 +436,33 @@ $$
\end{array}
$$

-There is a precondition for unifying two variables known as the *occurs check*
-which asserts that if we are applying a substitution of variable $x$ to an
-expression $e$, the variable $x$ cannot be free in $e$. Otherwise the rewrite
-would diverge, constantly rewriting itself. For example the following
-substitution would not pass the occurs check and would generate an infinitely
-long function type.
+If we want to unify a type variable $\alpha$ with a type $\tau$, we usually
+can just substitute the variable with the type: $[\alpha/\tau]$.
+However, our rules state a precondition known as the *occurs check*
+for that unification: the type variable $\alpha$ must not occur free in $\tau$.
+If it did, the substitution would not be a unifier.

-$$
-[x/x \rightarrow x]
-$$
+Take for example the problem of unifying $\alpha$ and $\alpha\rightarrow\beta$.
+The substitution $s=[\alpha/\alpha\rightarrow\beta]$ doesn't unify:
+we get
+$$[s]\alpha=\alpha\rightarrow\beta$$
+and
+$$[s](\alpha\rightarrow\beta)=(\alpha\rightarrow\beta)\rightarrow\beta.$$

-The occurs check will forbid the existence of many of the pathological livering
-terms we discussed when covering the untyped lambda calculus, including the
-omega combinator.
+Indeed, whatever substitution $s$ we try, $[s](\alpha\rightarrow\beta)$ will
+always be longer than $[s]\alpha$, so no unifier exists.
+The only chance would be to substitute with an infinite type:
+$[\alpha/(\dots((\alpha\rightarrow\beta)\rightarrow\beta)\rightarrow\dots
+ \rightarrow\beta)\rightarrow\beta]$
+would be a unifier, but our language has no such types.
+
+If the unification fails because of the occurs check, we say that unification
+would give an infinite type.
+
+Note that unifying $\alpha\rightarrow\beta$ and $\alpha$ is exactly what
+we would have to do if we tried to type check the omega combinator
+$\lambda x.x x$, so it is ruled out by the occurs check, as are other
+pathological terms we discussed when covering the untyped lambda calculus.

```haskell
occursCheck :: Substitutable a => TVar -> a -> Bool
```

@@ -948,31 +960,34 @@ This is a much more elegant solution than having to intermingle inference and
solving in the same pass, and adapts itself well to the generation of a typed
Core form which we will discuss in later chapters.

-Worked Example
--------------
+Worked Examples
+---------------
+
+Let's walk through two examples of how inference works for simple
+functions.

**Example 1**

-Let's walk through a few examples of how inference works for a few simple
-functions. Consider:
+Consider:

```haskell
\x y z -> x + y + z
```

-The generated type from the ``infer`` function is simply a fresh variable for
-each of the arguments and return type. This is completely unconstrained.
+The generated type from the ``infer`` function consists simply of a fresh
+variable for each of the arguments and the return type.

```haskell
a -> b -> c -> e
```

The constraints induced from **T-BinOp** are emitted as we traverse both of
-addition operations.
+the addition operations.

1. ``a -> b -> d ~ Int -> Int -> Int``
2. ``d -> c -> e ~ Int -> Int -> Int``

+Here ``d`` is the type of the intermediate term ``x + y``.
By applying **Uni-Arrow** we can then deduce the following set of substitutions.

1. ``a ~ Int``
@@ -993,29 +1008,32 @@ Int -> Int -> Int -> Int
compose f g x = f (g x)
```

-The generated type from the ``infer`` function is again simply a set of unique
+The generated type from the ``infer`` function again consists simply of unique
fresh variables.

```haskell
a -> b -> c -> e
```

-Induced two cases of the **T-App** we get the following constraints.
+From two applications of the **T-App** rule we get the following constraints:

1. ``b ~ c -> d``
2. ``a ~ d -> e``

-These are already in a canonical form, but applying **Uni-VarLeft** twice we get
-the following trivial set of substitutions.
+Here ``d`` is the type of ``(g x)``.
+The constraints are already in a canonical form; by applying **Uni-VarLeft**
+twice we get the following set of substitutions:

1. ``b ~ c -> d``
2. ``a ~ d -> e``

+So we get this type:
+
```haskell
compose :: forall c d e. (d -> e) -> (c -> d) -> c -> e
```

-If desired, you can rearrange the variables in alphabetical order to get:
+If desired, we can rename the variables in alphabetical order to get:

```haskell
compose :: forall a b c. (a -> b) -> (c -> a) -> c -> b
diff --git a/007_path.md b/007_path.md
index c7734b9..9e2c15d 100644
--- a/007_path.md
+++ b/007_path.md
@@ -27,14 +27,14 @@ Language Chapters Description
*ProtoHaskell* 8 - 18 Interpreted minimal Haskell subset.
*Fun* 18 - 27 ProtoHaskell with native code generator.

-The defining feature of ProtoHaskell is that is independent of an evaluation
-model, so hypothetically one could write either a lazy or strict backend and use
+The defining feature of ProtoHaskell is that it is independent of an evaluation
+model, so hypothetically one could write either a lazy or a strict backend and use
the same frontend. Before we launch into writing compiler passes let's look at
the overview of where we're going, the scope of what we're going to do, and what
needs to be done to get there. *We will refer to concepts that are not yet introduced, so
-keep is meant to be referred to as a high-level overview of the ProtoHaskell
+keep in mind this is meant to be a high-level overview of the ProtoHaskell
compiler pipeline.*

Haskell: A Rich Language
@@ -119,7 +119,7 @@ Things we will not implement are:
* Foreign Function Interface

Now if one feels so inclined one could of course implement these features on top
-our final language, but they are left as an exercise to the reader!
+of our final language, but they are left as an exercise to the reader!

This of course begs the question of whether or not our language is "a Haskell".
In the strictest sense, it will not be since it doesn't fully conform to either
@@ -277,7 +277,7 @@ Compiling module: prelude.fun
λ> :load test.fun
```

-Command line conventions will follow the Haskell's naming conventions. There
+Command line conventions will follow those of GHCi. There
will be a strong emphasis on building debugging systems on top of our
architecture so that when subtle bugs creep up you will have the tools to
diagnose the internal state of the type system and detect flaws in the
@@ -302,9 +302,9 @@ Command Action

The most notable difference is the very important ``:core`` command which will
dump out the core representation of any expression given in the interactive
-shell. Also the ``:constraints`` which will interactively walk you through the
-type checker's reasoning about it how derived the type it did for a given
-expression.
+shell. Another one is the ``:constraints`` command which will interactively
+walk you through the type checker's reasoning about how it derived the type
+it did for a given expression.

```haskell
ProtoHaskell> :type plus
@@ -348,7 +348,7 @@ Parser
------

We will use the normal Parsec parser with a few extensions. We will add
-indentation sensitive parser so that block syntax ( where statements, let
+indentation-sensitive parsing so that block syntax ( where statements, let
statements, do-notation ) can be parsed.

```haskell
@@ -359,7 +359,7 @@ main = do
msg = "Hello World"
```

-In addition we will need to allow for the addition of infix operators from be
+We will also need to allow the addition of infix operators from
user-defined declarations, and allow this information to be used during parsing.
```haskell
@@ -383,7 +383,7 @@ f x y = \g a0 -> a0 + y
```

We will also devise a general method of generating fresh names for each pass
-such that the names generated are uniquely identifiable to that pass and cannot
+such that the names generated are uniquely attributable to that pass and cannot
conflict with later passes.

Ensuring that all names are unique in the syntax tree will allow us more safety
@@ -412,11 +412,11 @@ Desugaring
----------

Pattern matching is an extremely important part of a modern functional
-programming, but the implementation of the pattern desugaring is remarkably
+programming language, but the implementation of the pattern desugaring is remarkably
subtle. The frontend syntax allows the expression of nested pattern matches and
incomplete patterns; both can generate very complex *splitting trees* of case
expressions that need to be expanded out recursively. We will use the algorithm
-devised Phil Wadler to perform this transformation.
+devised by Phil Wadler to perform this transformation.

**Multiple Equations**

@@ -504,7 +504,7 @@ syntactic sugar translations:
* Expand out numeric literals.

We will however punt on an important part of the Haskell specification, namely
-*overloaded literals*. In GHC Haskell numeric literals are replaced by specific
+*overloaded literals*. In Haskell numeric literals are replaced by specific
functions from the ``Num`` or ``Fractional`` typeclasses.

```haskell
@@ -519,8 +519,8 @@ fromRational (3.14 :: Rational)

We will not implement this, as it drastically expands the desugarer scope.

-We will however follow GHC's example in manifesting unboxed types are first
-class values in the language so literals that they appear in the AST rewritten
+We will however follow GHC's example in manifesting unboxed types as first
+class values in the language, so literals that appear in the AST are rewritten
in terms of the wired-in constructors (``Int#``, ``Char#``, ``Addr#``, etc).

```haskell
@@ -541,12 +541,12 @@ Core
----

The Core language is the result of translation of the frontend language into an
-explicitly typed form. Just like Haskell we will use a System-F variant,
+explicitly typed form. Just like GHC we will use a System-F variant,
although unlike GHC we will effectively just be using vanilla System-F without
all of the extensions ( coercions, equalities, roles, etc ) that GHC uses to
implement more complicated features like GADTs and type families.

-This is one of the most defining feature of GHC Haskell, is its compilation
+This is one of the defining features of GHC Haskell: its compilation
into a statically typed intermediate Core language. It is a well-engineered
detail of GHC's design that has informed much of how Haskell the language has
evolved as a language with an exceedingly large frontend language that all melts
@@ -586,8 +586,8 @@ data Kind
```

Since the Core language is explicitly typed, it is trivial to implement an
-internal type checker for. Running the typechecker on the generated core is a
-good way to catch optimization and desugaring bugs, and determine if the compiler
+internal type checker for it. Running the typechecker on the generated core is a
+good way to catch optimization and desugaring bugs, and determine if the compiler
has produced invalid intermediate code.
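
+As a sketch of what such an internal "lint" pass could look like, here is a
+minimal checker over a hypothetical cut-down Core (the names below are ours,
+not the actual Core AST of later chapters). Since every binder is annotated
+with its type, checking is a simple structural recursion:
+
+```haskell
+-- A toy, explicitly typed Core fragment (hypothetical, for illustration).
+data Type = TCon String | TArr Type Type
+  deriving (Eq, Show)
+
+data CoreExpr
+  = CVar String Type           -- variables carry their types
+  | CLam String Type CoreExpr  -- lambdas annotate their binder
+  | CApp CoreExpr CoreExpr     -- application
+
+-- Compute the type of a Core term, failing on ill-typed applications.
+typeOf :: CoreExpr -> Either String Type
+typeOf (CVar _ t)   = Right t
+typeOf (CLam _ t e) = TArr t <$> typeOf e
+typeOf (CApp f a)   = do
+  tf <- typeOf f
+  ta <- typeOf a
+  case tf of
+    TArr t1 t2 | t1 == ta -> Right t2
+    _ -> Left "lint: ill-typed application in Core"
+```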