mirror of
https://github.com/edwinb/Idris2-boot.git
synced 2024-11-28 05:32:03 +03:00
293 lines
9.3 KiB
ReStructuredText
293 lines
9.3 KiB
ReStructuredText
.. _sect-interp:
|
||
|
||
***********************************
|
||
Example: The Well-Typed Interpreter
|
||
***********************************
|
||
|
||
In this section, we’ll use the features we’ve seen so far to write a
|
||
larger example, an interpreter for a simple functional programming
|
||
language, with variables, function application, binary operators and
|
||
an ``if...then...else`` construct. We will use the dependent type
|
||
system to ensure that any programs which can be represented are
|
||
well-typed.
|
||
|
||
Representing Languages
|
||
======================
|
||
|
||
First, let us define the types in the language. We have integers,
|
||
booleans, and functions, represented by ``Ty``:
|
||
|
||
.. code-block:: idris
|
||
|
||
data Ty = TyInt | TyBool | TyFun Ty Ty
|
||
|
||
We can write a function to translate these representations to a concrete
|
||
Idris type — remember that types are first class, so can be
|
||
calculated just like any other value:
|
||
|
||
.. code-block:: idris
|
||
|
||
interpTy : Ty -> Type
|
||
interpTy TyInt = Integer
|
||
interpTy TyBool = Bool
|
||
interpTy (TyFun a t) = interpTy a -> interpTy t
|
||
|
||
We're going to define a representation of our language in such a way
|
||
that only well-typed programs can be represented. We'll index the
|
||
representations of expressions by their type, **and** the types of
|
||
local variables (the context). The context can be represented using
|
||
the ``Vect`` data type, so we'll need to import ``Data.Vect`` at the top of our
|
||
source file:
|
||
|
||
.. code-block:: idris
|
||
|
||
import Data.Vect
|
||
|
||
Expressions are indexed by the types of the local
|
||
variables, and the type of the expression itself:
|
||
|
||
.. code-block:: idris
|
||
|
||
data Expr : Vect n Ty -> Ty -> Type
|
||
|
||
The full representation of expressions is:
|
||
|
||
.. code-block:: idris
|
||
|
||
data HasType : (i : Fin n) -> Vect n Ty -> Ty -> Type where
|
||
Stop : HasType FZ (t :: ctxt) t
|
||
Pop : HasType k ctxt t -> HasType (FS k) (u :: ctxt) t
|
||
|
||
data Expr : Vect n Ty -> Ty -> Type where
|
||
Var : HasType i ctxt t -> Expr ctxt t
|
||
Val : (x : Integer) -> Expr ctxt TyInt
|
||
Lam : Expr (a :: ctxt) t -> Expr ctxt (TyFun a t)
|
||
App : Expr ctxt (TyFun a t) -> Expr ctxt a -> Expr ctxt t
|
||
Op : (interpTy a -> interpTy b -> interpTy c) ->
|
||
Expr ctxt a -> Expr ctxt b -> Expr ctxt c
|
||
If : Expr ctxt TyBool ->
|
||
Lazy (Expr ctxt a) ->
|
||
Lazy (Expr ctxt a) -> Expr ctxt a
|
||
|
||
The code above makes use of the ``Vect`` and ``Fin`` types from the
|
||
base libraries. ``Fin`` is available as part of ``Data.Vect``. Throughout,
|
||
``ctxt`` refers to the local variable context.
|
||
|
||
Since expressions are indexed by their type, we can read the typing
|
||
rules of the language from the definitions of the constructors. Let us
|
||
look at each constructor in turn.
|
||
|
||
We use a nameless representation for variables — they are *de Bruijn
|
||
indexed*. Variables are represented by a proof of their membership in
|
||
the context, ``HasType i ctxt T``, which is a proof that variable ``i``
|
||
in context ``ctxt`` has type ``T``. This is defined as follows:
|
||
|
||
.. code-block:: idris
|
||
|
||
data HasType : (i : Fin n) -> Vect n Ty -> Ty -> Type where
|
||
Stop : HasType FZ (t :: ctxt) t
|
||
Pop : HasType k ctxt t -> HasType (FS k) (u :: ctxt) t
|
||
|
||
We can treat *Stop* as a proof that the most recently defined variable
|
||
is well-typed, and *Pop n* as a proof that, if the ``n``\ th most
|
||
recently defined variable is well-typed, so is the ``n+1``\ th. In
|
||
practice, this means we use ``Stop`` to refer to the most recently
|
||
defined variable, ``Pop Stop`` to refer to the next, and so on, via
|
||
the ``Var`` constructor:
|
||
|
||
.. code-block:: idris
|
||
|
||
Var : HasType i ctxt t -> Expr ctxt t
|
||
|
||
So, in an expression ``\x. \y. x y``, the variable ``x`` would have a
|
||
de Bruijn index of 1, represented as ``Pop Stop``, and ``y 0``,
|
||
represented as ``Stop``. We find these by counting the number of
|
||
lambdas between the definition and the use.
|
||
|
||
A value carries a concrete representation of an integer:
|
||
|
||
.. code-block:: idris
|
||
|
||
Val : (x : Integer) -> Expr ctxt TyInt
|
||
|
||
A lambda creates a function. In the scope of a function of type ``a ->
|
||
t``, there is a new local variable of type ``a``, which is expressed
|
||
by the context index:
|
||
|
||
.. code-block:: idris
|
||
|
||
Lam : Expr (a :: ctxt) t -> Expr ctxt (TyFun a t)
|
||
|
||
Function application produces a value of type ``t`` given a function
|
||
from ``a`` to ``t`` and a value of type ``a``:
|
||
|
||
.. code-block:: idris
|
||
|
||
App : Expr ctxt (TyFun a t) -> Expr ctxt a -> Expr ctxt t
|
||
|
||
We allow arbitrary binary operators, where the type of the operator
|
||
informs what the types of the arguments must be:
|
||
|
||
.. code-block:: idris
|
||
|
||
Op : (interpTy a -> interpTy b -> interpTy c) ->
|
||
Expr ctxt a -> Expr ctxt b -> Expr ctxt c
|
||
|
||
Finally, ``If`` expressions make a choice given a boolean. Each branch
|
||
must have the same type, and we will evaluate the branches lazily so
|
||
that only the branch which is taken need be evaluated:
|
||
|
||
.. code-block:: idris
|
||
|
||
If : Expr ctxt TyBool ->
|
||
Lazy (Expr ctxt a) ->
|
||
Lazy (Expr ctxt a) ->
|
||
Expr ctxt a
|
||
|
||
Writing the Interpreter
|
||
=======================
|
||
|
||
When we evaluate an ``Expr``, we'll need to know the values in scope,
|
||
as well as their types. ``Env`` is an environment, indexed over the
|
||
types in scope. Since an environment is just another form of list,
|
||
albeit with a strongly specified connection to the vector of local
|
||
variable types, we use the usual ``::`` and ``Nil`` constructors so
|
||
that we can use the usual list syntax. Given a proof that a variable
|
||
is defined in the context, we can then produce a value from the
|
||
environment:
|
||
|
||
.. code-block:: idris
|
||
|
||
data Env : Vect n Ty -> Type where
|
||
Nil : Env Nil
|
||
(::) : interpTy a -> Env ctxt -> Env (a :: ctxt)
|
||
|
||
lookup : HasType i ctxt t -> Env ctxt -> interpTy t
|
||
lookup Stop (x :: xs) = x
|
||
lookup (Pop k) (x :: xs) = lookup k xs
|
||
|
||
Given this, an interpreter is a function which
|
||
translates an ``Expr`` into a concrete Idris value with respect to a
|
||
specific environment:
|
||
|
||
.. code-block:: idris
|
||
|
||
interp : Env ctxt -> Expr ctxt t -> interpTy t
|
||
|
||
The complete interpreter is defined as follows, for reference. For
|
||
each constructor, we translate it into the corresponding Idris value:
|
||
|
||
.. code-block:: idris
|
||
|
||
interp env (Var i) = lookup i env
|
||
interp env (Val x) = x
|
||
interp env (Lam sc) = \x => interp (x :: env) sc
|
||
interp env (App f s) = interp env f (interp env s)
|
||
interp env (Op op x y) = op (interp env x) (interp env y)
|
||
interp env (If x t e) = if interp env x then interp env t
|
||
else interp env e
|
||
|
||
Let us look at each case in turn. To translate a variable, we simply look it
|
||
up in the environment:
|
||
|
||
.. code-block:: idris
|
||
|
||
interp env (Var i) = lookup i env
|
||
|
||
To translate a value, we just return the concrete representation of the
|
||
value:
|
||
|
||
.. code-block:: idris
|
||
|
||
interp env (Val x) = x
|
||
|
||
Lambdas are more interesting. In this case, we construct a function
|
||
which interprets the scope of the lambda with a new value in the
|
||
environment. So, a function in the object language is translated to an
|
||
Idris function:
|
||
|
||
.. code-block:: idris
|
||
|
||
interp env (Lam sc) = \x => interp (x :: env) sc
|
||
|
||
For an application, we interpret the function and its argument and apply
|
||
it directly. We know that interpreting ``f`` must produce a function,
|
||
because of its type:
|
||
|
||
.. code-block:: idris
|
||
|
||
interp env (App f s) = interp env f (interp env s)
|
||
|
||
Operators and conditionals are, again, direct translations into the
|
||
equivalent Idris constructs. For operators, we apply the function to
|
||
its operands directly, and for ``If``, we apply the Idris
|
||
``if...then...else`` construct directly.
|
||
|
||
.. code-block:: idris
|
||
|
||
interp env (Op op x y) = op (interp env x) (interp env y)
|
||
interp env (If x t e) = if interp env x then interp env t
|
||
else interp env e
|
||
|
||
Testing
|
||
=======
|
||
|
||
We can make some simple test functions. Firstly, adding two inputs
|
||
``\x. \y. y + x`` is written as follows:
|
||
|
||
.. code-block:: idris
|
||
|
||
add : Expr ctxt (TyFun TyInt (TyFun TyInt TyInt))
|
||
add = Lam (Lam (Op (+) (Var Stop) (Var (Pop Stop))))
|
||
|
||
More interestingly, a factorial function ``fact``
|
||
(e.g. ``\x. if (x == 0) then 1 else (fact (x-1) * x)``),
|
||
can be written as:
|
||
|
||
.. code-block:: idris
|
||
|
||
fact : Expr ctxt (TyFun TyInt TyInt)
|
||
fact = Lam (If (Op (==) (Var Stop) (Val 0))
|
||
(Val 1)
|
||
(Op (*) (App fact (Op (-) (Var Stop) (Val 1)))
|
||
(Var Stop)))
|
||
|
||
Running
|
||
=======
|
||
|
||
To finish, we write a ``main`` program which interprets the factorial
|
||
function on user input:
|
||
|
||
.. code-block:: idris
|
||
|
||
main : IO ()
|
||
main = do putStr "Enter a number: "
|
||
x <- getLine
|
||
printLn (interp [] fact (cast x))
|
||
|
||
Here, ``cast`` is an overloaded function which converts a value from
|
||
one type to another if possible. Here, it converts a string to an
|
||
integer, giving 0 if the input is invalid. An example run of this
|
||
program at the Idris interactive environment is:
|
||
|
||
.. _factrun:
|
||
.. literalinclude:: ../listing/idris-prompt-interp.txt
|
||
|
||
|
||
Aside: ``cast``
|
||
---------------
|
||
|
||
The prelude defines an interface ``Cast`` which allows conversion
|
||
between types:
|
||
|
||
.. code-block:: idris
|
||
|
||
interface Cast from to where
|
||
cast : from -> to
|
||
|
||
It is a *multi-parameter* interface, defining the source type and
|
||
object type of the cast. It must be possible for the type checker to
|
||
infer *both* parameters at the point where the cast is applied. There
|
||
are casts defined between all of the primitive types, as far as they
|
||
make sense.
|