Document changes since Idris 1 in more detail

This commit is contained in:
Edwin Brady 2020-02-26 21:46:03 +00:00
parent e340747b1f
commit 99a133c582
2 changed files with 523 additions and 25 deletions

View File

@ -3,11 +3,13 @@
Type Driven Development with Idris: Updates Required
====================================================
The code in the book "Type Driven Development with Idris" by Edwin Brady,
published by Manning, will mostly work in Idris 2, with some small changes
as detailed in this document. The updated code is also [going to be] part
of the test suite (see `tests/typedd-book <https://github.com/edwinb/Idris2/tree/master/tests/typedd-book>`_
in the Idris 2 source).
The code in the book `Type-Driven Development with Idris
<https://www.manning.com/books/type-driven-development-with-idris>`_ by Edwin
Brady, available from `Manning <https://www.manning.com>`_, will mostly work
in Idris 2, with some small changes as detailed in this document. The updated
code is also [going to be] part of the test suite (see `tests/typedd-book
<https://github.com/edwinb/Idris2/tree/master/tests/typedd-book>`_ in the Idris
2 source).
Chapter 1
---------

View File

@ -9,6 +9,10 @@ exceptions. This document describes the changes, approximately in order of
likelihood of encountering them in practice. New features are described at
the end, in Section :ref:`sect-new-features`.
The Section :ref:`typedd-index` describes how these changes affect the code
in the book `Type-Driven Development with Idris <https://www.manning.com/books/type-driven-development-with-idris>`_
by Edwin Brady, available from `Manning <https://www.manning.com>`_.
[NOT YET COMPLETE]
.. note::
@ -22,16 +26,223 @@ the end, in Section :ref:`sect-new-features`.
New Core Language: Quantities in Types
======================================
Idris 2 is based on `Quantitative Type Theory (QTT)
<https://bentnib.org/quantitative-type-theory.html>`_, a core language
developed by Bob Atkey and Conor McBride. In practice, this means that every
variable in Idris 2 has a *quantity* associated with it. A quantity is one of:
* ``0``, meaning that the variable is *erased* at run time
* ``1``, meaning that the variable is used *exactly once* at run time
* *Unrestricted*, which is the same behaviour as Idris 1
For more details on this, see Section :ref:`sect-multiplicities`. In practice,
this might cause some Idris 1 programs not to type check in Idris 2 due to
attempting to use an argument which is erased at run time.
Erasure
-------
In Idris, names which begin with a lower case later are automatically bound
as implicit arguments in types, for example in the following skeleton
definition, ``n``, ``a`` and ``m`` are implicitly bound:
.. code-block:: idris
append : Vect n a -> Vect m a -> Vect (n + m) a
append xs ys = ?append_rhs
One of the difficulties in compiling a dependently typed programming language
is deciding which arguments are used at run time and which can safely be
erased. More importantly, it's one of the difficulties when programming: how
can a programmer *know* when an argument will be erased?
In Idris 2, a variable's quantity tells us whether it will be available at
run time or not. We can see the quantities of the variables in scope in
``append_rhs`` by inspecting the hole at the REPL:
::
Main> :t append_rhs
0 m : Nat
0 a : Type
0 n : Nat
ys : Vect m a
xs : Vect n a
-------------------------------------
append_rhs : Vect (plus n m) a
The ``0`` next to ``m``, ``a`` and ``n`` mean that they are in scope, but there
will be ``0`` occurrences at run-time. That is, it is *guaranteed* that they
will be erased at run-time.
This does lead to some potential difficulties when converting Idris 1 programs,
if you are using implicit arguments at run time. For example, in Idris 1 you
can get the length of a vector as follows:
.. code-block:: idris
vlen : Vect n a -> Nat
vlen {n} xs = n
This might seem like a good idea, since it runs in constant time and takes
advantage of the type level information, but the trade off is that ``n`` has to
be available at run time, so at run time we always need the length of the
vector to be available if we ever call ``vlen``. Idris 1 can infer whether the
length is needed, but there's no easy way for a programmer to be sure.
In Idris 2, we need to state explicitly that ``n`` is needed at run time
.. code-block:: idris
vlen : {n : Nat} -> Vect n a -> Nat
vlen xs = n
(Incidentally, also note that in Idris 2, names bound in types are also available
in the definition without explicitly rebinding them.)
This also means that when you call ``vlen``, you need the length available. For
example, this will give an error
.. code-block:: idris
sumLengths : Vect m a -> Vect n a —> Nat
sumLengths xs ys = vlen xs + vlen ys
Idris 2 reports::
vlen.idr:7:20--7:28:While processing right hand side of Main.sumLengths at vlen.idr:7:1--10:1:
m is not accessible in this context
This means that it needs to use ``m`` as an argument to pass to ``vlen xs``,
where it needs to be available at run time, but ``m`` is not available in
``sumLengths`` because it has multiplicity ``0``.
We can see this more clearly by replacing the right hand side of
``sumLengths`` with a hole...
.. code-block:: idris
sumLengths : Vect m a -> Vect n a -> Nat
sumLengths xs ys = ?sumLengths_rhs
...then checking the hole's type at the REPL::
Main> :t sumLengths_rhs
0 n : Nat
0 a : Type
0 m : Nat
ys : Vect n a
xs : Vect m a
-------------------------------------
sumLengths_rhs : Nat
Instead, we need to give bindings for ``m`` and ``n`` with
unrestricted multiplicity
.. code-block:: idris
sumLengths : {m, n : _} -> Vect m a -> Vect n a —> Nat
sumLengths xs ys = vLen xs + vLen xs
Remember that giving no multiplicity on a binder, as with ``m`` and ``n`` here,
means that the variable has unrestricted usage.
If you're converting Idris 1 programs to work with Idris 2, this is probably the
biggest thing you need to think about. It is important to note,
though, that if you have bound implicits, such as...
.. code-block:: idris
excitingFn : {t : _} -> Coffee t -> Moonbase t
...then it's a good idea to make sure ``t`` really is needed, or performance
might suffer due to the run time building the instance of ``t`` unnecessarily!
One final note on erasure: it is an error to try to pattern match on an
argument with multiplicity ``0``, unless its value is inferrable from
elsewhere. So, the following definition is rejected
.. code-block:: idris
badNot : (0 x : Bool) -> Bool
badNot False = True
badNot True = False
This is rejected with the error::
badnot.idr:2:1--3:1:Attempt to match on erased argument False in
Main.badNot
The following, however, is fine, because in ``sNot``, even though we appear
to match on the erased argument ``x``, its value is uniquely inferrable from
the type of the second argument
.. code-block:: idris
data SBool : Bool -> Type where
SFalse : SBool False
STrue : SBool True
sNot : (0 x : Bool) -> SBool x -> Bool
sNot False SFalse = True
sNot True STrue = False
Experience with Idris 2 so far suggests that, most of the time, as long as
you're using unbound implicits in your Idris 1 programs, they will work without
much modification in Idris 2. The Idris 2 type checker will point out where you
require an unbound implicit argument at run time - sometimes this is both
surprising and enlightening!
Linearity
---------
Full details on linear arguments with multiplicity ``1`` are given in Section
:ref:`sect-multiplicities`. In brief, the intution behind multiplicity ``1`` is
that if we have a function with a type of the following form...
.. code-block:: idris
f : (1 x : a) -> b
...then the guarantee given by the type system is that *if* ``f x`` *is used
exactly once, then* ``x`` *is used exactly once* in the process.
Prelude and ``base`` libraries
==============================
Prelide is much smaller, things moved to ``base``.
The Prelude in Idris 1 contained a lot of definitions, many of them rarely
necessary. The philosophy in Idris 2 is different. The (rather vaguely
specified) rule of thumb is that it should contain the basic functions
required by almost any non-trivial program.
This is a vague specification since different programmers will consider
different things absolutely necessary, but the result is that it contains:
- Anything the elaborator can desugar to (e.g. tuples, ``()``, ``=``)
- Basic types ``Bool``, ``Nat``, ``List``, ``Stream``, ``Dec``, ``Maybe``,
``Either``
- The most important utility functions: ``id``, ``the``, composition, etc
- Interfaces for arithmetic and implementations for the primitives and
basic types
- Basic ``Char`` and ``String`` manipulation
- ``Show``, ``Eq``, ``Ord``, and implementations for all types in the prelude
- Interfaces and functions for basic proof (``cong``, ``Uninhabited``, etc)
- ``Semigroup``, ``Monoid``
- ``Functor``, ``Applicative``, ``Monad`` and related functions
- ``Foldable``, ``Alternative`` and ``Traversable``
- ``Enum``, for list range syntax
- Console ``IO``
Anything which doesn't fit in here has been moved to the ``base`` libraries.
Among other places, you can find some of the functions which used to be
in the prelude in:
- ``Data.List`` and ``Data.Nat``
- ``Data.Maybe`` and ``Data.Either``
- ``System.File`` and ``System.Directory``, (file management was previously
part of the prelude)
- ``Decidable.Equality``
Smaller Changes
===============
@ -39,27 +250,140 @@ Smaller Changes
Ambiguous Name Resolution
-------------------------
Idris 1 worked very hard to resolve ambiguous names, by type, even if this
involved some complicated interaction with interface resolution. This could
sometimes be the cause of long type checking times. Idris 2 simplifies this,
at the cost of requiring more programmer annotations on ambiguous names.
As a general rule, Idris 2 will be able to disambiguate between names which
have different concrete return types (such as data constructors), or which have
different concrete argument types (such as record projections). It may struggle
to resolve ambiguities if one name requires an interface to be resolved.
It will postpone resolution if a name can't be resolved immediately, but unlike
Idris 1, it won't attempt significant backtracking
If you get an ambiguous name error, generally the best approach is to
give the namespace explicitly.
One difficulty which remains is resolving ambiguous names where one possibility
is an interface method, and another is a concrete top level function.
For example, we may have:
.. code-block:: idris
Prelude.(>>=) : Monad m => m a -> (a -> m b) -> m b
LinearIO.(>>=) : (1 act : IO a) -> (1 k : a -> IO b) -> IO b
As a pragmatic choice, if type checking in a context where the more concrete
name is valid (``LinearIO.(>>=)`` here, so if type checking an expression known
to have type ``IO t`` for some ``t``), the more concrete name will be chosen.
This is somehow unsatisfying, so we may revisit this in future!
Modules, namespaces and export
------------------------------
Change in visibility rules, and module names must match filenames except
``Main``.
The visibility rules, as controlled by the ``private``, ``export`` and
``public export`` modifiers, now refer to the visibility of names from
other *namespaces*, rather than other *files*.
So if you have the following, all in the same file...
.. code-block:: idris
namespace A
private
aHidden : Int -> Int
aHidden x = x * 2
export
aVisible : Int -> Int
aVisibile x = aHidden x
mamespace B
export
bVisible : Int -> Int
bVisible x = aVisible (x * 2)
...then ``bVisible`` can access ``aVisible``, but not ``aHidden``.
Records are, as before, defined in their own namespace, but fields are always
visible from the parent namespace.
Also, module names must now match the filename in which they're defined, with
the exception of the module ``Main``, which can be defined in a file with
any name.
``%language`` pragmas
---------------------
Not moved to Idris 2, any extensions are not implemented.
There are several ``%language`` pragmas in Idris 1, which define various
experimental extensions. None of these are available in Idris 2, although
extensions may be defined in the future.
``let`` bindings
----------------
Now don't compute, and equivalent to ``(\x => e) val``, but see
:ref:`sect-local-defs` below.
``let`` bindings, i.e. expressions of the form ``let x = val in e`` have
slightly different behaviour. Previously, you could rely on the computational
behaviour of ``x`` in the scope of ``e``, so type checking could take into
account that ``x`` reduces to ``val``. Unfortunately, this leads to complications
with ``case`` and ``with`` clauses: if we want to preserve the computational
behaviour, we would need to make significant changes to the way ``case`` and
``with`` are elaborated.
So, for simplicity and consistency (and, realistically, because I don't have
enough time to resolve the problem for ``case`` and ``with``) the above
expression ``let x = val in e`` is equivalent to ``(\x => e) val``.
So, ``let`` now effectively generalises a complex subexpression.
If you do need the computational behaviour of a definition, it is now possible
using local function definitions instead - see Section :ref:`sect-local-defs`
below.
``auto``-implicits and Interfaces
---------------------------------
Now the same thing
Interfaces and ``auto``-implicit arguments are similar, in that they invoke an
expression search mechanism to find the value of an argument. In Idris 1,
they were implemented separately, but in Idris 2, they use the same mechanism.
Consider the following *total* definition of ``fromMaybe``:
.. code-block:: idris
data IsJust : Maybe a -> Type where
ItIsJust : IsJust (Just val)
fromMaybe : (x : Maybe a) -> {auto p : IsJust x} -> a
fromMaybe (Just x) {p = ItIsJust} = x
Since interface resolution and ``auto``-implicits are now the same thing,
the type of ``fromMaybe`` can be written as:
.. code-block:: idris
fromMaybe : (x : Maybe a) -> IsJust x => a
So now, the constraint arrow ``=>`` means that the argument will be found
by ``auto``-implicit search.
When defining a ``data`` type, there are ways to control how ``auto``-implicit
search will proceed, by giving options to the data type. For example:
.. code-block:: idris
data Elem : (x : a) -> (xs : List a) -> Type where
[search x]
Here : Elem x (x :: xs)
There : Elem x xs -> Elem x (y :: xs)
The ``search x`` option means that ``auto``-implicit search for a value of
type ``Elem t ts`` will start as soon as the type checker has resolved the
value ``t``, even if ``ts`` is still unknown.
By default, ``auto``-implicit search uses the constructors of a data type
as search hints. The ``noHints`` option on a data type turns this behaviour
off.
Totality and Coverage
---------------------
@ -70,23 +394,44 @@ planned soon!]
Build artefacts
---------------
Not really a language change, but a change in the way Idris saves checked
files. All checked modules are now saved in a directory ``build/ttc``, in the
root of the source tree, with the directory structure following the directory
structure of the source. Executables are placed in ``build/exec``.
This is not really a language change, but a change in the way Idris saves
checked files, and still useful to know. All checked modules are now saved in a
directory ``build/ttc``, in the root of the source tree, with the directory
structure following the directory structure of the source. Executables are
placed in ``build/exec``.
.. _sect-new-features:
New features
============
In addition to updating the core to QTT, described above.
As well as the change to the core language to use quantitative type theory,
described above, there are several other new features.
.. _sect-local-defs:
Local function definitions
--------------------------
Functions can now be defined locally, using a ``let`` block. For example,
``greet`` in the following example, which is defined in the scope of a local
variable ``x``:
.. code-block:: idris
chat : IO ()
chat
= do putStr "Name: "
x <- getLine
let greet : String -> String
greet msg = msg ++ " " ++ x
putStrLn (greet "Hello")
putStrLn (greet "Bye")
These ``let`` blocks can be used anywhere (in the middle of ``do`` blocks
as above, but in any function, or in type declarations).
``where`` blocks are now elaborated by translation into a local ``let``.
However, Idris no longer attempts to infer types for functions defined in
``where`` blocks, because this was fragile. This may be reinstated, if we can
come up with a good, predictable approach.
@ -94,30 +439,181 @@ come up with a good, predictable approach.
Scope of implicit arguments
---------------------------
Implicit arguments in a type are now in scope in the body of a definition.
We've already seen this above, where ``n`` is in scope automatically in the
body of ``vlen``:
.. code-block:: idris
vlen : {n : Nat} -> Vect n a -> Nat
vlen xs = n
This is important to remember when using ``where`` blocks, or local definitions,
since the names in scope will also be in scope when declaring the *type* of
the local definition. For example, the following definition, where we attempt
to define our own version of ``Show`` for ``Vect``, will fail to type check:
.. code-block:: idris
showVect : Show a => Vect n a -> String
showVect xs = "[" ++ showBody xs ++ "]"
where
showBody : Vect n a -> String
showBody [] = ""
showBody [x] = show x
showBody (x :: xs) = show x ++ ", " ++ showBody xs
This fails because ``n`` is in scope already in the type declaration for
``showBody``, and so the first clause ``showBody []`` will fail to type check
because ``[]`` has length ``Z``, not ``n``. We can fix this by locally
binding ``n``:
.. code-block:: idris
showVect : Show a => Vect n a -> String
showVect xs = "[" ++ showBody xs ++ "]"
where
showBody : forall n . Vect n a -> String
...
Or, alternatively, using a new name:
.. code-block:: idris
showVect : Show a => Vect n a -> String
showVect xs = "[" ++ showBody xs ++ "]"
where
showBody : Vect n' a -> String
...
Idris 1 took a different approach here: names which were parameters to data
types were in scope, other names were not. The Idris 2 approach is, we hope,
more consistent and easier to understand.
Better inference
----------------
Holes global to a source file, case works better.
In Idris 1, holes (that is, unification variables arising from implicit
arguments) were local to an expression, and if they were not resolved while
checking the expression, they would not be resolved at all. In Idris 2,
they are global, so inference works better. For example, we can now say:
.. code-block:: idris
test : Vect ? Int
test = [1,2,3,4]
Main> :t test
Main.test : Vect (S (S (S (S Z)))) Int
The ``?``, incidentally, differs from ``_`` in that ``_`` will be bound as
an implicit argument if unresolved after checking the type of ``test``, but
``?`` will be left as a hole to be resolved later. Otherwise, they can be
used interchangeably.
Dependent case
--------------
``case`` blocks were available in Idris 1, but with some restrictions. Having
better inference means that ``case`` blocks work more effectively in Idris 2,
and dependent case analysis is supported.
.. code-block:: idris
append : Vect n a -> Vect m a -> Vect (n + m) a
append xs ys
= case xs of
[] => ys
(x :: xs) => x :: append xs ys
The implicit arguments and original values are still available in the body of
the ``case``. Somewhat contrived, but the following is valid:
.. code-block:: idris
info : {n : _} -> Vect n a -> (Vect n a, Nat)
info xs
= case xs of
[] => ([], n)
(y :: ys) => (xs, n)
Record updates
--------------
Dependent record updates now work, as long as all relevant fields are updated
at once.
Dependent record updates work, provided that as all relevant fields are updated
at the same time. Dependent record update is implemented via ``case`` blocks
rather than by generating a specific update function for each field as in
Idris 1, so you will no longer get mystifying errors when trying to update
dependent records!
For example, we can wrap a vector in a record, with an explicit length
field:
.. code-block:: idris
record WrapVect a where
constructor MkVect
purpose : String
length : Nat
content : Vect len a
Then, we can safely update the ``content``, provided we update then ``length``
correspondingly:
.. code-block:: idris
addEntry : String -> WrapVect String -> WrapVect String
addEntry val = record { len $= S,
content $= (val :: ) }
Generate definition
-------------------
A new feature of the IDE protocol, generating complete definitions from a
type signature.
type signature. You can try this at the REPL, for example, given our
favourite introductory example...
.. code-block:: idris
append : Vect n a -> Vect m a -> Vect (n + m) a
...assuming this is defined on line 3, you can use the ``:gd`` command as
follows:
.. code-block:: idris
Main> :gd 3 append
append [] ys = ys
append (x :: xs) ys = x :: append xs ys
This is a fairly simple brute force search, which tries searching for a valid
right hand side, and case splitting on the left if that fails, but is
remarkably effective in a lot of situations. Some other examples which work:
.. code-block:: idris
my_cong : forall f . (x : a) -> (y : a) -> x = y -> f x = f y
my_curry : ((a, b) -> c) -> a -> b -> c
my_uncurry : (a -> b -> c) -> (a, b) -> c
append : Vect n a -> Vect m a -> Vect (n + m) a
lappend : (1 xs : List a) -> (1 ys : List a) -> List a
zipWith : (a -> b -> c) -> Vect n a -> Vect n b -> Vect n c
This is available in the IDE protocol via the ``generate-def`` command.
Chez Scheme target
------------------
Also, optionally, Racket, which itself can compile via Chez scheme. Early
experience shows that both are much faster than the Idris 1 C code generator,
in both compile time and execution time (but we haven't done any formal
study on this yet, so it's just anecdotal evidence).
The default code generator is, for the moment,
`Chez Scheme <https://www.scheme.com/>`_. A Racket code generator is also available. There is not yet
a way to plug in code generators as in Idris 1, but this is coming. To change
code generate, you can use the ``:set cg`` command:
::
Main> :set cg racket
Early experience shows that both are much faster than the Idris 1 C code
generator, in both compile time and execution time (but we haven't done any
formal study on this yet, so it's just anecdotal evidence).