diff --git a/docs/typedd/typedd.rst b/docs/typedd/typedd.rst index 0cd9613..296689a 100644 --- a/docs/typedd/typedd.rst +++ b/docs/typedd/typedd.rst @@ -3,11 +3,13 @@ Type Driven Development with Idris: Updates Required ==================================================== -The code in the book "Type Driven Development with Idris" by Edwin Brady, -published by Manning, will mostly work in Idris 2, with some small changes -as detailed in this document. The updated code is also [going to be] part -of the test suite (see `tests/typedd-book `_ -in the Idris 2 source). +The code in the book `Type-Driven Development with Idris +`_ by Edwin +Brady, available from `Manning `_, will mostly work +in Idris 2, with some small changes as detailed in this document. The updated +code is also [going to be] part of the test suite (see `tests/typedd-book +`_ in the Idris +2 source). Chapter 1 --------- diff --git a/docs/updates/updates.rst b/docs/updates/updates.rst index 79743f9..3f7b6ba 100644 --- a/docs/updates/updates.rst +++ b/docs/updates/updates.rst @@ -9,6 +9,10 @@ exceptions. This document describes the changes, approximately in order of likelihood of encountering them in practice. New features are described at the end, in Section :ref:`sect-new-features`. +The Section :ref:`typedd-index` describes how these changes affect the code +in the book `Type-Driven Development with Idris `_ +by Edwin Brady, available from `Manning `_. + [NOT YET COMPLETE] .. note:: @@ -22,16 +26,223 @@ the end, in Section :ref:`sect-new-features`. New Core Language: Quantities in Types ====================================== +Idris 2 is based on `Quantitative Type Theory (QTT) +`_, a core language +developed by Bob Atkey and Conor McBride. In practice, this means that every +variable in Idris 2 has a *quantity* associated with it. A quantity is one of: + +* ``0``, meaning that the variable is *erased* at run time +* ``1``, meaning that the variable is used *exactly once* at run time +* *Unrestricted*, which is the same behaviour as Idris 1 + +For more details on this, see Section :ref:`sect-multiplicities`. In practice, +this might cause some Idris 1 programs not to type check in Idris 2 due to +attempting to use an argument which is erased at run time. + Erasure ------- +In Idris, names which begin with a lower case later are automatically bound +as implicit arguments in types, for example in the following skeleton +definition, ``n``, ``a`` and ``m`` are implicitly bound: + +.. code-block:: idris + + append : Vect n a -> Vect m a -> Vect (n + m) a + append xs ys = ?append_rhs + +One of the difficulties in compiling a dependently typed programming language +is deciding which arguments are used at run time and which can safely be +erased. More importantly, it's one of the difficulties when programming: how +can a programmer *know* when an argument will be erased? + +In Idris 2, a variable's quantity tells us whether it will be available at +run time or not. We can see the quantities of the variables in scope in +``append_rhs`` by inspecting the hole at the REPL: + +:: + + Main> :t append_rhs + 0 m : Nat + 0 a : Type + 0 n : Nat + ys : Vect m a + xs : Vect n a + ------------------------------------- + append_rhs : Vect (plus n m) a + +The ``0`` next to ``m``, ``a`` and ``n`` mean that they are in scope, but there +will be ``0`` occurrences at run-time. That is, it is *guaranteed* that they +will be erased at run-time. + +This does lead to some potential difficulties when converting Idris 1 programs, +if you are using implicit arguments at run time. For example, in Idris 1 you +can get the length of a vector as follows: + +.. code-block:: idris + + vlen : Vect n a -> Nat + vlen {n} xs = n + +This might seem like a good idea, since it runs in constant time and takes +advantage of the type level information, but the trade off is that ``n`` has to +be available at run time, so at run time we always need the length of the +vector to be available if we ever call ``vlen``. Idris 1 can infer whether the +length is needed, but there's no easy way for a programmer to be sure. + +In Idris 2, we need to state explicitly that ``n`` is needed at run time + +.. code-block:: idris + + vlen : {n : Nat} -> Vect n a -> Nat + vlen xs = n + +(Incidentally, also note that in Idris 2, names bound in types are also available +in the definition without explicitly rebinding them.) + +This also means that when you call ``vlen``, you need the length available. For +example, this will give an error + +.. code-block:: idris + + sumLengths : Vect m a -> Vect n a —> Nat + sumLengths xs ys = vlen xs + vlen ys + +Idris 2 reports:: + + vlen.idr:7:20--7:28:While processing right hand side of Main.sumLengths at vlen.idr:7:1--10:1: + m is not accessible in this context + +This means that it needs to use ``m`` as an argument to pass to ``vlen xs``, +where it needs to be available at run time, but ``m`` is not available in +``sumLengths`` because it has multiplicity ``0``. + +We can see this more clearly by replacing the right hand side of +``sumLengths`` with a hole... + +.. code-block:: idris + + sumLengths : Vect m a -> Vect n a -> Nat + sumLengths xs ys = ?sumLengths_rhs + +...then checking the hole's type at the REPL:: + + Main> :t sumLengths_rhs + 0 n : Nat + 0 a : Type + 0 m : Nat + ys : Vect n a + xs : Vect m a + ------------------------------------- + sumLengths_rhs : Nat + +Instead, we need to give bindings for ``m`` and ``n`` with +unrestricted multiplicity + +.. code-block:: idris + + sumLengths : {m, n : _} -> Vect m a -> Vect n a —> Nat + sumLengths xs ys = vLen xs + vLen xs + +Remember that giving no multiplicity on a binder, as with ``m`` and ``n`` here, +means that the variable has unrestricted usage. + +If you're converting Idris 1 programs to work with Idris 2, this is probably the +biggest thing you need to think about. It is important to note, +though, that if you have bound implicits, such as... + +.. code-block:: idris + + excitingFn : {t : _} -> Coffee t -> Moonbase t + +...then it's a good idea to make sure ``t`` really is needed, or performance +might suffer due to the run time building the instance of ``t`` unnecessarily! + +One final note on erasure: it is an error to try to pattern match on an +argument with multiplicity ``0``, unless its value is inferrable from +elsewhere. So, the following definition is rejected + +.. code-block:: idris + + badNot : (0 x : Bool) -> Bool + badNot False = True + badNot True = False + +This is rejected with the error:: + + badnot.idr:2:1--3:1:Attempt to match on erased argument False in + Main.badNot + +The following, however, is fine, because in ``sNot``, even though we appear +to match on the erased argument ``x``, its value is uniquely inferrable from +the type of the second argument + +.. code-block:: idris + + data SBool : Bool -> Type where + SFalse : SBool False + STrue : SBool True + + sNot : (0 x : Bool) -> SBool x -> Bool + sNot False SFalse = True + sNot True STrue = False + +Experience with Idris 2 so far suggests that, most of the time, as long as +you're using unbound implicits in your Idris 1 programs, they will work without +much modification in Idris 2. The Idris 2 type checker will point out where you +require an unbound implicit argument at run time - sometimes this is both +surprising and enlightening! + + Linearity --------- +Full details on linear arguments with multiplicity ``1`` are given in Section +:ref:`sect-multiplicities`. In brief, the intution behind multiplicity ``1`` is +that if we have a function with a type of the following form... + +.. code-block:: idris + + f : (1 x : a) -> b + +...then the guarantee given by the type system is that *if* ``f x`` *is used +exactly once, then* ``x`` *is used exactly once* in the process. + Prelude and ``base`` libraries ============================== -Prelide is much smaller, things moved to ``base``. +The Prelude in Idris 1 contained a lot of definitions, many of them rarely +necessary. The philosophy in Idris 2 is different. The (rather vaguely +specified) rule of thumb is that it should contain the basic functions +required by almost any non-trivial program. + +This is a vague specification since different programmers will consider +different things absolutely necessary, but the result is that it contains: + +- Anything the elaborator can desugar to (e.g. tuples, ``()``, ``=``) +- Basic types ``Bool``, ``Nat``, ``List``, ``Stream``, ``Dec``, ``Maybe``, + ``Either`` +- The most important utility functions: ``id``, ``the``, composition, etc +- Interfaces for arithmetic and implementations for the primitives and + basic types +- Basic ``Char`` and ``String`` manipulation +- ``Show``, ``Eq``, ``Ord``, and implementations for all types in the prelude +- Interfaces and functions for basic proof (``cong``, ``Uninhabited``, etc) +- ``Semigroup``, ``Monoid`` +- ``Functor``, ``Applicative``, ``Monad`` and related functions +- ``Foldable``, ``Alternative`` and ``Traversable`` +- ``Enum``, for list range syntax +- Console ``IO`` + +Anything which doesn't fit in here has been moved to the ``base`` libraries. +Among other places, you can find some of the functions which used to be +in the prelude in: + +- ``Data.List`` and ``Data.Nat`` +- ``Data.Maybe`` and ``Data.Either`` +- ``System.File`` and ``System.Directory``, (file management was previously + part of the prelude) +- ``Decidable.Equality`` Smaller Changes =============== @@ -39,27 +250,140 @@ Smaller Changes Ambiguous Name Resolution ------------------------- +Idris 1 worked very hard to resolve ambiguous names, by type, even if this +involved some complicated interaction with interface resolution. This could +sometimes be the cause of long type checking times. Idris 2 simplifies this, +at the cost of requiring more programmer annotations on ambiguous names. + +As a general rule, Idris 2 will be able to disambiguate between names which +have different concrete return types (such as data constructors), or which have +different concrete argument types (such as record projections). It may struggle +to resolve ambiguities if one name requires an interface to be resolved. +It will postpone resolution if a name can't be resolved immediately, but unlike +Idris 1, it won't attempt significant backtracking + +If you get an ambiguous name error, generally the best approach is to +give the namespace explicitly. + +One difficulty which remains is resolving ambiguous names where one possibility +is an interface method, and another is a concrete top level function. +For example, we may have: + +.. code-block:: idris + + Prelude.(>>=) : Monad m => m a -> (a -> m b) -> m b + LinearIO.(>>=) : (1 act : IO a) -> (1 k : a -> IO b) -> IO b + +As a pragmatic choice, if type checking in a context where the more concrete +name is valid (``LinearIO.(>>=)`` here, so if type checking an expression known +to have type ``IO t`` for some ``t``), the more concrete name will be chosen. + +This is somehow unsatisfying, so we may revisit this in future! + Modules, namespaces and export ------------------------------ -Change in visibility rules, and module names must match filenames except -``Main``. +The visibility rules, as controlled by the ``private``, ``export`` and +``public export`` modifiers, now refer to the visibility of names from +other *namespaces*, rather than other *files*. + +So if you have the following, all in the same file... + +.. code-block:: idris + + namespace A + private + aHidden : Int -> Int + aHidden x = x * 2 + + export + aVisible : Int -> Int + aVisibile x = aHidden x + + mamespace B + export + bVisible : Int -> Int + bVisible x = aVisible (x * 2) + +...then ``bVisible`` can access ``aVisible``, but not ``aHidden``. + +Records are, as before, defined in their own namespace, but fields are always +visible from the parent namespace. + +Also, module names must now match the filename in which they're defined, with +the exception of the module ``Main``, which can be defined in a file with +any name. ``%language`` pragmas --------------------- -Not moved to Idris 2, any extensions are not implemented. +There are several ``%language`` pragmas in Idris 1, which define various +experimental extensions. None of these are available in Idris 2, although +extensions may be defined in the future. ``let`` bindings ---------------- -Now don't compute, and equivalent to ``(\x => e) val``, but see -:ref:`sect-local-defs` below. +``let`` bindings, i.e. expressions of the form ``let x = val in e`` have +slightly different behaviour. Previously, you could rely on the computational +behaviour of ``x`` in the scope of ``e``, so type checking could take into +account that ``x`` reduces to ``val``. Unfortunately, this leads to complications +with ``case`` and ``with`` clauses: if we want to preserve the computational +behaviour, we would need to make significant changes to the way ``case`` and +``with`` are elaborated. + +So, for simplicity and consistency (and, realistically, because I don't have +enough time to resolve the problem for ``case`` and ``with``) the above +expression ``let x = val in e`` is equivalent to ``(\x => e) val``. + +So, ``let`` now effectively generalises a complex subexpression. +If you do need the computational behaviour of a definition, it is now possible +using local function definitions instead - see Section :ref:`sect-local-defs` +below. ``auto``-implicits and Interfaces --------------------------------- -Now the same thing +Interfaces and ``auto``-implicit arguments are similar, in that they invoke an +expression search mechanism to find the value of an argument. In Idris 1, +they were implemented separately, but in Idris 2, they use the same mechanism. +Consider the following *total* definition of ``fromMaybe``: + +.. code-block:: idris + + data IsJust : Maybe a -> Type where + ItIsJust : IsJust (Just val) + + fromMaybe : (x : Maybe a) -> {auto p : IsJust x} -> a + fromMaybe (Just x) {p = ItIsJust} = x + +Since interface resolution and ``auto``-implicits are now the same thing, +the type of ``fromMaybe`` can be written as: + +.. code-block:: idris + + fromMaybe : (x : Maybe a) -> IsJust x => a + +So now, the constraint arrow ``=>`` means that the argument will be found +by ``auto``-implicit search. + +When defining a ``data`` type, there are ways to control how ``auto``-implicit +search will proceed, by giving options to the data type. For example: + +.. code-block:: idris + + data Elem : (x : a) -> (xs : List a) -> Type where + [search x] + Here : Elem x (x :: xs) + There : Elem x xs -> Elem x (y :: xs) + +The ``search x`` option means that ``auto``-implicit search for a value of +type ``Elem t ts`` will start as soon as the type checker has resolved the +value ``t``, even if ``ts`` is still unknown. + +By default, ``auto``-implicit search uses the constructors of a data type +as search hints. The ``noHints`` option on a data type turns this behaviour +off. Totality and Coverage --------------------- @@ -70,23 +394,44 @@ planned soon!] Build artefacts --------------- -Not really a language change, but a change in the way Idris saves checked -files. All checked modules are now saved in a directory ``build/ttc``, in the -root of the source tree, with the directory structure following the directory -structure of the source. Executables are placed in ``build/exec``. +This is not really a language change, but a change in the way Idris saves +checked files, and still useful to know. All checked modules are now saved in a +directory ``build/ttc``, in the root of the source tree, with the directory +structure following the directory structure of the source. Executables are +placed in ``build/exec``. .. _sect-new-features: New features ============ -In addition to updating the core to QTT, described above. +As well as the change to the core language to use quantitative type theory, +described above, there are several other new features. .. _sect-local-defs: Local function definitions -------------------------- +Functions can now be defined locally, using a ``let`` block. For example, +``greet`` in the following example, which is defined in the scope of a local +variable ``x``: + +.. code-block:: idris + + chat : IO () + chat + = do putStr "Name: " + x <- getLine + let greet : String -> String + greet msg = msg ++ " " ++ x + putStrLn (greet "Hello") + putStrLn (greet "Bye") + +These ``let`` blocks can be used anywhere (in the middle of ``do`` blocks +as above, but in any function, or in type declarations). +``where`` blocks are now elaborated by translation into a local ``let``. + However, Idris no longer attempts to infer types for functions defined in ``where`` blocks, because this was fragile. This may be reinstated, if we can come up with a good, predictable approach. @@ -94,30 +439,181 @@ come up with a good, predictable approach. Scope of implicit arguments --------------------------- +Implicit arguments in a type are now in scope in the body of a definition. +We've already seen this above, where ``n`` is in scope automatically in the +body of ``vlen``: + +.. code-block:: idris + + vlen : {n : Nat} -> Vect n a -> Nat + vlen xs = n + +This is important to remember when using ``where`` blocks, or local definitions, +since the names in scope will also be in scope when declaring the *type* of +the local definition. For example, the following definition, where we attempt +to define our own version of ``Show`` for ``Vect``, will fail to type check: + +.. code-block:: idris + + showVect : Show a => Vect n a -> String + showVect xs = "[" ++ showBody xs ++ "]" + where + showBody : Vect n a -> String + showBody [] = "" + showBody [x] = show x + showBody (x :: xs) = show x ++ ", " ++ showBody xs + +This fails because ``n`` is in scope already in the type declaration for +``showBody``, and so the first clause ``showBody []`` will fail to type check +because ``[]`` has length ``Z``, not ``n``. We can fix this by locally +binding ``n``: + +.. code-block:: idris + + showVect : Show a => Vect n a -> String + showVect xs = "[" ++ showBody xs ++ "]" + where + showBody : forall n . Vect n a -> String + ... + +Or, alternatively, using a new name: + +.. code-block:: idris + + showVect : Show a => Vect n a -> String + showVect xs = "[" ++ showBody xs ++ "]" + where + showBody : Vect n' a -> String + ... + +Idris 1 took a different approach here: names which were parameters to data +types were in scope, other names were not. The Idris 2 approach is, we hope, +more consistent and easier to understand. + Better inference ---------------- -Holes global to a source file, case works better. +In Idris 1, holes (that is, unification variables arising from implicit +arguments) were local to an expression, and if they were not resolved while +checking the expression, they would not be resolved at all. In Idris 2, +they are global, so inference works better. For example, we can now say: + +.. code-block:: idris + + test : Vect ? Int + test = [1,2,3,4] + + Main> :t test + Main.test : Vect (S (S (S (S Z)))) Int + +The ``?``, incidentally, differs from ``_`` in that ``_`` will be bound as +an implicit argument if unresolved after checking the type of ``test``, but +``?`` will be left as a hole to be resolved later. Otherwise, they can be +used interchangeably. Dependent case -------------- +``case`` blocks were available in Idris 1, but with some restrictions. Having +better inference means that ``case`` blocks work more effectively in Idris 2, +and dependent case analysis is supported. + +.. code-block:: idris + + append : Vect n a -> Vect m a -> Vect (n + m) a + append xs ys + = case xs of + [] => ys + (x :: xs) => x :: append xs ys + +The implicit arguments and original values are still available in the body of +the ``case``. Somewhat contrived, but the following is valid: + +.. code-block:: idris + + info : {n : _} -> Vect n a -> (Vect n a, Nat) + info xs + = case xs of + [] => ([], n) + (y :: ys) => (xs, n) + + Record updates -------------- -Dependent record updates now work, as long as all relevant fields are updated -at once. +Dependent record updates work, provided that as all relevant fields are updated +at the same time. Dependent record update is implemented via ``case`` blocks +rather than by generating a specific update function for each field as in +Idris 1, so you will no longer get mystifying errors when trying to update +dependent records! + +For example, we can wrap a vector in a record, with an explicit length +field: + +.. code-block:: idris + + record WrapVect a where + constructor MkVect + purpose : String + length : Nat + content : Vect len a + +Then, we can safely update the ``content``, provided we update then ``length`` +correspondingly: + +.. code-block:: idris + + addEntry : String -> WrapVect String -> WrapVect String + addEntry val = record { len $= S, + content $= (val :: ) } Generate definition ------------------- A new feature of the IDE protocol, generating complete definitions from a -type signature. +type signature. You can try this at the REPL, for example, given our +favourite introductory example... + +.. code-block:: idris + + append : Vect n a -> Vect m a -> Vect (n + m) a + +...assuming this is defined on line 3, you can use the ``:gd`` command as +follows: + +.. code-block:: idris + + Main> :gd 3 append + append [] ys = ys + append (x :: xs) ys = x :: append xs ys + +This is a fairly simple brute force search, which tries searching for a valid +right hand side, and case splitting on the left if that fails, but is +remarkably effective in a lot of situations. Some other examples which work: + +.. code-block:: idris + + my_cong : forall f . (x : a) -> (y : a) -> x = y -> f x = f y + my_curry : ((a, b) -> c) -> a -> b -> c + my_uncurry : (a -> b -> c) -> (a, b) -> c + append : Vect n a -> Vect m a -> Vect (n + m) a + lappend : (1 xs : List a) -> (1 ys : List a) -> List a + zipWith : (a -> b -> c) -> Vect n a -> Vect n b -> Vect n c + +This is available in the IDE protocol via the ``generate-def`` command. Chez Scheme target ------------------ -Also, optionally, Racket, which itself can compile via Chez scheme. Early -experience shows that both are much faster than the Idris 1 C code generator, -in both compile time and execution time (but we haven't done any formal -study on this yet, so it's just anecdotal evidence). +The default code generator is, for the moment, +`Chez Scheme `_. A Racket code generator is also available. There is not yet +a way to plug in code generators as in Idris 1, but this is coming. To change +code generate, you can use the ``:set cg`` command: + +:: + + Main> :set cg racket + +Early experience shows that both are much faster than the Idris 1 C code +generator, in both compile time and execution time (but we haven't done any +formal study on this yet, so it's just anecdotal evidence).