Idris2-boot/docs/tutorial/interp.rst

.. _sect-interp:

***********************************
Example: The Well-Typed Interpreter
***********************************

In this section, we’ll use the features we’ve seen so far to write a
larger example, an interpreter for a simple functional programming
language, with variables, function application, binary operators and
an ``if...then...else`` construct. We will use the dependent type
system to ensure that any programs which can be represented are
well-typed.

Representing Languages
======================

First, let us define the types in the language. We have integers,
booleans, and functions, represented by ``Ty``:

.. code-block:: idris

    data Ty = TyInt | TyBool | TyFun Ty Ty

We can write a function to translate these representations to a concrete
Idris type — remember that types are first class, so can be
calculated just like any other value:

.. code-block:: idris

    interpTy : Ty -> Type
    interpTy TyInt       = Integer
    interpTy TyBool      = Bool
    interpTy (TyFun a t) = interpTy a -> interpTy t

We're going to define a representation of our language in such a way
that only well-typed programs can be represented. We'll index the
representations of expressions by their type, **and** the types of
local variables (the context). The context can be represented using
the ``Vect`` data type, so we'll need to import ``Data.Vect`` at the top of our
source file:

.. code-block:: idris

    import Data.Vect

Expressions are indexed by the types of the local
variables, and the type of the expression itself:

.. code-block:: idris

    data Expr : Vect n Ty -> Ty -> Type

The full representation of expressions is:

.. code-block:: idris

    data HasType : (i : Fin n) -> Vect n Ty -> Ty -> Type where
        Stop : HasType FZ (t :: ctxt) t
        Pop  : HasType k ctxt t -> HasType (FS k) (u :: ctxt) t

    data Expr : Vect n Ty -> Ty -> Type where
        Var : HasType i ctxt t -> Expr ctxt t
        Val : (x : Integer) -> Expr ctxt TyInt
        Lam : Expr (a :: ctxt) t -> Expr ctxt (TyFun a t)
        App : Expr ctxt (TyFun a t) -> Expr ctxt a -> Expr ctxt t
        Op  : (interpTy a -> interpTy b -> interpTy c) ->
              Expr ctxt a -> Expr ctxt b -> Expr ctxt c
        If  : Expr ctxt TyBool ->
              Lazy (Expr ctxt a) ->
              Lazy (Expr ctxt a) -> Expr ctxt a

The code above makes use of the ``Vect`` and ``Fin`` types from the
base libraries. ``Fin`` is available as part of ``Data.Vect``. Throughout,
``ctxt`` refers to the local variable context.

Since expressions are indexed by their type, we can read the typing
rules of the language from the definitions of the constructors. Let us
look at each constructor in turn.

We use a nameless representation for variables — they are *de Bruijn
indexed*. Variables are represented by a proof of their membership in
the context, ``HasType i ctxt T``, which is a proof that variable ``i``
in context ``ctxt`` has type ``T``. This is defined as follows:

.. code-block:: idris

    data HasType : (i : Fin n) -> Vect n Ty -> Ty -> Type where
        Stop : HasType FZ (t :: ctxt) t
        Pop  : HasType k ctxt t -> HasType (FS k) (u :: ctxt) t

We can treat *Stop* as a proof that the most recently defined variable
is well-typed, and *Pop n* as a proof that, if the ``n``\ th most
recently defined variable is well-typed, so is the ``n+1``\ th. In
practice, this means we use ``Stop`` to refer to the most recently
defined variable, ``Pop Stop`` to refer to the next, and so on, via
the ``Var`` constructor:

.. code-block:: idris

    Var : HasType i ctxt t -> Expr ctxt t

So, in an expression ``\x. \y. x y``, the variable ``x`` would have a
de Bruijn index of 1, represented as ``Pop Stop``, and ``y 0``,
represented as ``Stop``. We find these by counting the number of
lambdas between the definition and the use.

A value carries a concrete representation of an integer:

.. code-block:: idris

    Val : (x : Integer) -> Expr ctxt TyInt

A lambda creates a function. In the scope of a function of type ``a ->
t``, there is a new local variable of type ``a``, which is expressed
by the context index:

.. code-block:: idris

    Lam : Expr (a :: ctxt) t -> Expr ctxt (TyFun a t)

Function application produces a value of type ``t`` given a function
from ``a`` to ``t`` and a value of type ``a``:

.. code-block:: idris

    App : Expr ctxt (TyFun a t) -> Expr ctxt a -> Expr ctxt t

We allow arbitrary binary operators, where the type of the operator
informs what the types of the arguments must be:

.. code-block:: idris

    Op : (interpTy a -> interpTy b -> interpTy c) ->
         Expr ctxt a -> Expr ctxt b -> Expr ctxt c

Finally, ``If`` expressions make a choice given a boolean. Each branch
must have the same type, and we will evaluate the branches lazily so
that only the branch which is taken need be evaluated:

.. code-block:: idris

    If : Expr ctxt TyBool ->
         Lazy (Expr ctxt a) ->
         Lazy (Expr ctxt a) ->
         Expr ctxt a

Writing the Interpreter
=======================

When we evaluate an ``Expr``, we'll need to know the values in scope,
as well as their types. ``Env`` is an environment, indexed over the
types in scope. Since an environment is just another form of list,
albeit with a strongly specified connection to the vector of local
variable types, we use the usual ``::`` and ``Nil`` constructors so
that we can use the usual list syntax. Given a proof that a variable
is defined in the context, we can then produce a value from the
environment:

.. code-block:: idris

    data Env : Vect n Ty -> Type where
        Nil  : Env Nil
        (::) : interpTy a -> Env ctxt -> Env (a :: ctxt)

    lookup : HasType i ctxt t -> Env ctxt -> interpTy t
    lookup Stop    (x :: xs) = x
    lookup (Pop k) (x :: xs) = lookup k xs

Given this, an interpreter is a function which
translates an ``Expr`` into a concrete Idris value with respect to a
specific environment:

.. code-block:: idris

    interp : Env ctxt -> Expr ctxt t -> interpTy t

The complete interpreter is defined as follows, for reference. For
each constructor, we translate it into the corresponding Idris value:

.. code-block:: idris

    interp env (Var i)     = lookup i env
    interp env (Val x)     = x
    interp env (Lam sc)    = \x => interp (x :: env) sc
    interp env (App f s)   = interp env f (interp env s)
    interp env (Op op x y) = op (interp env x) (interp env y)
    interp env (If x t e)  = if interp env x then interp env t
                                             else interp env e

Let us look at each case in turn. To translate a variable, we simply look it
up in the environment:

.. code-block:: idris

    interp env (Var i) = lookup i env

To translate a value, we just return the concrete representation of the
value:

.. code-block:: idris

    interp env (Val x) = x

Lambdas are more interesting. In this case, we construct a function
which interprets the scope of the lambda with a new value in the
environment. So, a function in the object language is translated to an
Idris function:

.. code-block:: idris

    interp env (Lam sc) = \x => interp (x :: env) sc

For an application, we interpret the function and its argument and apply
it directly. We know that interpreting ``f`` must produce a function,
because of its type:

.. code-block:: idris

    interp env (App f s) = interp env f (interp env s)

Operators and conditionals are, again, direct translations into the
equivalent Idris constructs. For operators, we apply the function to
its operands directly, and for ``If``, we apply the Idris
``if...then...else`` construct directly.

.. code-block:: idris

    interp env (Op op x y) = op (interp env x) (interp env y)
    interp env (If x t e)  = if interp env x then interp env t
                                             else interp env e

Testing
=======

We can make some simple test functions. Firstly, adding two inputs
``\x. \y. y + x`` is written as follows:

.. code-block:: idris

    add : Expr ctxt (TyFun TyInt (TyFun TyInt TyInt))
    add = Lam (Lam (Op (+) (Var Stop) (Var (Pop Stop))))

More interestingly, a factorial function ``fact``
(e.g. ``\x. if (x == 0) then 1 else (fact (x-1) * x)``),
can be written as:

.. code-block:: idris

    fact : Expr ctxt (TyFun TyInt TyInt)
    fact = Lam (If (Op (==) (Var Stop) (Val 0))
                   (Val 1)
                   (Op (*) (App fact (Op (-) (Var Stop) (Val 1)))
                           (Var Stop)))

Running
=======

To finish, we write a ``main`` program which interprets the factorial
function on user input:

.. code-block:: idris

    main : IO ()
    main = do putStr "Enter a number: "
              x <- getLine
              printLn (interp [] fact (cast x))

Here, ``cast`` is an overloaded function which converts a value from
one type to another if possible. Here, it converts a string to an
integer, giving 0 if the input is invalid. An example run of this
program at the Idris interactive environment is:

.. _factrun:
.. literalinclude:: ../listing/idris-prompt-interp.txt


Aside: ``cast``
---------------

The prelude defines an interface ``Cast`` which allows conversion
between types:

.. code-block:: idris

    interface Cast from to where
        cast : from -> to

It is a *multi-parameter* interface, defining the source type and
object type of the cast. It must be possible for the type checker to
infer *both* parameters at the point where the cast is applied. There
are casts defined between all of the primitive types, as far as they
make sense.