mirror of
https://github.com/idris-lang/Idris2.git
synced 2025-01-05 15:08:00 +03:00
574 lines
18 KiB
ReStructuredText
574 lines
18 KiB
ReStructuredText
.. _sect-multiplicities:
|
|
|
|
**************
|
|
Multiplicities
|
|
**************
|
|
|
|
Idris 2 is
|
|
based on `Quantitative Type Theory (QTT)
|
|
<https://bentnib.org/quantitative-type-theory.html>`_, a core language
|
|
developed by Bob Atkey and Conor McBride. In practice, this means that every
|
|
variable in Idris 2 has a *quantity* associated with it. A quantity is one of:
|
|
|
|
* ``0``, meaning that the variable is *erased* at run time
|
|
* ``1``, meaning that the variable is used *exactly once* at run time
|
|
* *Unrestricted*, which is the same behaviour as Idris 1
|
|
|
|
We can see the multiplicities of variables by inspecting holes. For example,
|
|
if we have the following skeleton definition of ``append`` on vectors...
|
|
|
|
.. code-block:: idris
|
|
|
|
append : Vect n a -> Vect m a -> Vect (n + m) a
|
|
append xs ys = ?append_rhs
|
|
|
|
...we can look at the hole ``append_rhs``:
|
|
|
|
::
|
|
|
|
Main> :t append_rhs
|
|
0 m : Nat
|
|
0 a : Type
|
|
0 n : Nat
|
|
ys : Vect m a
|
|
xs : Vect n a
|
|
-------------------------------------
|
|
append_rhs : Vect (plus n m) a
|
|
|
|
The ``0`` next to ``m``, ``a`` and ``n`` mean that they are in scope, but there
|
|
will be ``0`` occurrences at run-time. That is, it is **guaranteed** that they
|
|
will be erased at run-time.
|
|
|
|
Multiplicities can be explicitly written in function types as follows:
|
|
|
|
* ``ignoreN : (0 n : Nat) -> Vect n a -> Nat`` - this function cannot look at
|
|
``n`` at run time
|
|
* ``duplicate : (1 x : a) -> (a, a)`` - this function must use ``x`` exactly
|
|
once (so good luck implementing it, by the way. There is no implementation
|
|
because it would need to use ``x`` twice!)
|
|
|
|
If there is no multiplicity annotation, the argument is unrestricted.
|
|
If, on the other hand, a name is implicitly bound (like ``a`` in both examples above)
|
|
the argument is erased. So, the above types could also be written as:
|
|
|
|
* ``ignoreN : {0 a : _} -> (0 n : Nat) -> Vect n a -> Nat``
|
|
* ``duplicate : {0 a : _} -> (1 x : a) -> (a, a)``
|
|
|
|
This section describes what this means for your Idris 2 programs in practice,
|
|
with several examples. In particular, it describes:
|
|
|
|
* :ref:`sect-erasure` - how to know what is relevant at run time and what is erased
|
|
* :ref:`sect-linearity` - using the type system to encode *resource usage protocols*
|
|
* :ref:`sect-pmtypes` - truly first class types
|
|
|
|
The most important of these for most programs, and the thing you'll need to
|
|
know about if converting Idris 1 programs to work with Idris 2, is erasure_.
|
|
The most interesting, however, and the thing which gives Idris 2 much more
|
|
expressivity, is linearity_, so we'll start there.
|
|
|
|
.. _sect-linearity:
|
|
|
|
Linearity
|
|
---------
|
|
|
|
The ``1`` multiplicity expresses that a variable must be used exactly once.
|
|
First, we'll see how this works on some small examples of functions and
|
|
data types, then see how it can be used to encode `resource protocols`_.
|
|
|
|
Above, we saw the type of ``duplicate``. Let's try to write it interactively,
|
|
and see what goes wrong. We'll start by giving the type and a skeleton
|
|
definition with a hole
|
|
|
|
.. code-block:: idris
|
|
|
|
duplicate : (1 x : a) -> (a, a)
|
|
duplicate x = ?help
|
|
|
|
Checking the type of a hole tells us the multiplicity of each variable in
|
|
scope. If we check the type of ``?help`` we'll see that we can't
|
|
use ``a`` at run time, and we have to use ``x`` exactly once::
|
|
|
|
Main> :t help
|
|
0 a : Type
|
|
1 x : a
|
|
-------------------------------------
|
|
help : (a, a)
|
|
|
|
If we use ``x`` for one part of the pair...
|
|
|
|
.. code-block:: idris
|
|
|
|
duplicate : (1 x : a) -> (a, a)
|
|
duplicate x = (x, ?help)
|
|
|
|
...then the type of the remaining hole tells us we can't use it for the other::
|
|
|
|
Main> :t help
|
|
0 a : Type
|
|
0 x : a
|
|
-------------------------------------
|
|
help : a
|
|
|
|
The same happens if we try defining ``duplicate x = (?help, x)`` (try it!).
|
|
|
|
In order to avoid parsing ambiguities, if you give an explicit multiplicity
|
|
for a variable as with the argument to ``duplicate``, you need to give it
|
|
a name too. But, if the name isn't used in the scope of the type, you
|
|
can use ``_`` instead of a name, as follows:
|
|
|
|
.. code-block:: idris
|
|
|
|
duplicate : (1 _ : a) -> (a, a)
|
|
|
|
The intution behind multiplicity ``1`` is that if we have a function with
|
|
a type of the following form...
|
|
|
|
.. code-block:: idris
|
|
|
|
f : (1 x : a) -> b
|
|
|
|
...then the guarantee given by the type system is that *if* ``f x`` *is used
|
|
exactly once, then* ``x`` *is used exactly once*. So, if we insist on
|
|
trying to define ``duplicate``...::
|
|
|
|
duplicate x = (x, x)
|
|
|
|
...then Idris will complain::
|
|
|
|
pmtype.idr:2:15--8:1:While processing right hand side of Main.duplicate at pmtype.idr:2:1--8:1:
|
|
There are 2 uses of linear name x
|
|
|
|
A similar intuition applies for data types. Consider the following types,
|
|
``Lin`` which wraps an argument that must be used once, and ``Unr`` which
|
|
wraps an argument with unrestricted use
|
|
|
|
.. code-block:: idris
|
|
|
|
data Lin : Type -> Type where
|
|
MkLin : (1 _ : a) -> Lin a
|
|
|
|
data Unr : Type -> Type where
|
|
MkUnr : a -> Unr a
|
|
|
|
If ``MkLin x`` is used once, then ``x`` is used once. But if ``MkUnr x`` is
|
|
used once, there is no guarantee on how often ``x`` is used. We can see this a
|
|
bit more clearly by starting to write projection functions for ``Lin`` and
|
|
``Unr`` to extract the argument
|
|
|
|
.. code-block:: idris
|
|
|
|
getLin : (1 _ : Lin a) -> a
|
|
getLin (MkLin x) = ?howmanyLin
|
|
|
|
getUnr : (1 _ : Unr a) -> a
|
|
getUnr (MkUnr x) = ?howmanyUnr
|
|
|
|
Checking the types of the holes shows us that, for ``getLin``, we must use
|
|
``x`` exactly once (Because the ``val`` argument is used once,
|
|
by pattern matching on it as ``MkLin x``, and if ``MkLin x`` is used once,
|
|
``x`` must be used once)::
|
|
|
|
Main> :t howmanyLin
|
|
0 a : Type
|
|
1 x : a
|
|
-------------------------------------
|
|
howmanyLin : a
|
|
|
|
For ``getUnr``, however, we still have to use ``val`` once, again by pattern
|
|
matching on it, but using ``MkUnr x`` once doesn't place any restrictions on
|
|
``x``. So, ``x`` has unrestricted use in the body of ``getUnr``::
|
|
|
|
Main> :t howmanyUnr
|
|
0 a : Type
|
|
x : a
|
|
-------------------------------------
|
|
howmanyUnr : a
|
|
|
|
If ``getLin`` has an unrestricted argument...
|
|
|
|
.. code-block:: idris
|
|
|
|
getLin : Lin a -> a
|
|
getLin (MkLin x) = ?howmanyLin
|
|
|
|
...then ``x`` is unrestricted in ``howmanyLin``::
|
|
|
|
Main> :t howmanyLin
|
|
0 a : Type
|
|
x : a
|
|
-------------------------------------
|
|
howmanyLin : a
|
|
|
|
Remember the intuition from the type of ``MkLin`` is that if ``MkLin x`` is
|
|
used exactly once, ``x`` is used exactly once. But, we didn't say that
|
|
``MkLin x`` would be used exactly once, so there is no restriction on ``x``.
|
|
|
|
Resource protocols
|
|
~~~~~~~~~~~~~~~~~~
|
|
|
|
One way to take advantage of being able to express linear usage of an argument
|
|
is in defining resource usage protocols, where we can use linearity to ensure
|
|
that any unique external resource has only one instance, and we can use
|
|
functions which are linear in their arguments to represent state transitions on
|
|
that resource. A door, for example, can be in one of two states, ``Open`` or
|
|
``Closed``
|
|
|
|
.. code-block:: idris
|
|
|
|
data DoorState = Open | Closed
|
|
|
|
data Door : DoorState -> Type where
|
|
MkDoor : (doorId : Int) -> Door st
|
|
|
|
(Okay, we're just pretending here - imagine the ``doorId`` is a reference
|
|
to an external resource!)
|
|
|
|
We can define functions for opening and closing the door which explicitly
|
|
describe how they change the state of a door, and that they are linear in
|
|
the door
|
|
|
|
.. code-block:: idris
|
|
|
|
openDoor : (1 d : Door Closed) -> Door Open
|
|
closeDoor : (1 d : Door Open) -> Door Closed
|
|
|
|
Remember, the intuition is that if ``openDoor d`` is used exactly once,
|
|
then ``d`` is used exactly once. So, provided that a door ``d`` has
|
|
multiplicity ``1`` when it's created, we *know* that once we call
|
|
``openDoor`` on it, we won't be able to use ``d`` again. Given that
|
|
``d`` is an external resource, and ``openDoor`` has changed it's state,
|
|
this is a good thing!
|
|
|
|
We can ensure that any door we create has multiplicity ``1`` by
|
|
creating them with a ``newDoor`` function with the following type
|
|
|
|
.. code-block:: idris
|
|
|
|
newDoor : (1 p : (1 d : Door Closed) -> IO ()) -> IO ()
|
|
|
|
That is, ``newDoor`` takes a function, which it runs exactly once. That
|
|
function takes a door, which is used exactly once. We'll run it in
|
|
``IO`` to suggest that there is some interaction with the outside world
|
|
going on when we create the door. Since the multiplicity ``1`` means the
|
|
door has to be used exactly once, we need to be able to delete the door
|
|
when we're finished
|
|
|
|
.. code-block:: idris
|
|
|
|
deleteDoor : (1 d : Door Closed) -> IO ()
|
|
|
|
So an example correct door protocol usage would be
|
|
|
|
.. code-block:: idris
|
|
|
|
doorProg : IO ()
|
|
doorProg
|
|
= newDoor $ \d =>
|
|
let d' = openDoor d
|
|
d'' = closeDoor d' in
|
|
deleteDoor d''
|
|
|
|
It's instructive to build this program interactively, with holes along
|
|
the way, and see how the multiplicities of ``d``, ``d'`` etc change. For
|
|
example
|
|
|
|
.. code-block:: idris
|
|
|
|
doorProg : IO ()
|
|
doorProg
|
|
= newDoor $ \d =>
|
|
let d' = openDoor d in
|
|
?whatnow
|
|
|
|
Checking the type of ``?whatnow`` shows that ``d`` is now spent, but we
|
|
still have to use ``d'`` exactly once::
|
|
|
|
Main> :t whatnow
|
|
0 d : Door Closed
|
|
1 d' : Door Open
|
|
-------------------------------------
|
|
whatnow : IO ()
|
|
|
|
Note that the ``0`` multiplicity for ``d`` means that we can still *talk*
|
|
about it - in particular, we can still reason about it in types - but we
|
|
can't use it again in a relevant position in the rest of the program.
|
|
It's also fine to shadow the name ``d`` throughout
|
|
|
|
.. code-block:: idris
|
|
|
|
doorProg : IO ()
|
|
doorProg
|
|
= newDoor $ \d =>
|
|
let d = openDoor d
|
|
d = closeDoor d in
|
|
deleteDoor d
|
|
|
|
If we don't follow the protocol correctly - create the door, open it, close
|
|
it, then delete it - then the program won't type check. For example, we
|
|
can try not to delete the door before finishing
|
|
|
|
.. code-block:: idris
|
|
|
|
doorProg : IO ()
|
|
doorProg
|
|
= newDoor $ \d =>
|
|
let d' = openDoor d
|
|
d'' = closeDoor d' in
|
|
putStrLn "What could possibly go wrong?"
|
|
|
|
This gives the following error::
|
|
|
|
Door.idr:15:19--15:38:While processing right hand side of Main.doorProg at Door.idr:13:1--17:1:
|
|
There are 0 uses of linear name d''
|
|
|
|
There's a lot more to be said about the details here! But, this shows at
|
|
a high level how we can use linearity to capture resource usage protocols
|
|
at the type level. If we have an external resource which is guaranteed to
|
|
be used linearly, like ``Door``, we don't need to run operations on that
|
|
resource in an ``IO`` monad, since we're already enforcing an ordering on
|
|
operations and don't have access to any out of date resource states. This is
|
|
similar to the way interactive programs work in
|
|
`the Clean programming language <https://clean.cs.ru.nl/Clean>`_, and in
|
|
fact is how ``IO`` is implemented internally in Idris 2, with a special
|
|
``%World`` type for representing the state of the outside world that is
|
|
always used linearly
|
|
|
|
.. code-block:: idris
|
|
|
|
public export
|
|
data IORes : Type -> Type where
|
|
MkIORes : (result : a) -> (1 x : %World) -> IORes a
|
|
|
|
export
|
|
data IO : Type -> Type where
|
|
MkIO : (1 fn : (1 x : %World) -> IORes a) -> IO a
|
|
|
|
Having multiplicities in the type system raises several interesting
|
|
questions, such as:
|
|
|
|
* Can we use linearity information to inform memory management and, for
|
|
example, have type level guarantees about functions which will not need
|
|
to perform garbage collection?
|
|
* How should multiplicities be incorporated into interfaces such as
|
|
``Functor``, ``Applicative`` and ``Monad``?
|
|
* If we have ``0``, and ``1`` as multiplicities, why stop there? Why not have
|
|
``2``, ``3`` and more (like `Granule
|
|
<https://granule-project.github.io/granule.html>`_)
|
|
* What about multiplicity polymorphism, as in the `Linear Haskell proposal <https://arxiv.org/abs/1710.09756>`_?
|
|
* Even without all of that, what can we do *now*?
|
|
|
|
.. _sect-erasure:
|
|
|
|
Erasure
|
|
-------
|
|
|
|
The ``1`` multiplicity give us many possibilities in the kinds of
|
|
properties we can express. But, the ``0`` multiplicity is perhaps more
|
|
important in that it allows us to be precise about which values are
|
|
relevant at run time, and which are compile time only (that is, which are
|
|
erased). Using the ``0`` multiplicity means a function's type now tells us
|
|
exactly what it needs at run time.
|
|
|
|
For example, in Idris 1 you could get the length of a vector as follows
|
|
|
|
.. code-block:: idris
|
|
|
|
vlen : Vect n a -> Nat
|
|
vlen {n} xs = n
|
|
|
|
This is fine, since it runs in constant time, but the trade off is that
|
|
``n`` has to be available at run time, so at run time we always need the length
|
|
of the vector to be available if we ever call ``vlen``. Idris 1 can infer whether
|
|
the length is needed, but there's no easy way for a programmer to be sure.
|
|
|
|
In Idris 2, we need to state explicitly that ``n`` is needed at run time
|
|
|
|
.. code-block:: idris
|
|
|
|
vlen : {n : Nat} -> Vect n a -> Nat
|
|
vlen xs = n
|
|
|
|
(Incidentally, also note that in Idris 2, names bound in types are also available
|
|
in the definition without explicitly rebinding them.)
|
|
|
|
This also means that when you call ``vlen``, you need the length available. For
|
|
example, this will give an error
|
|
|
|
.. code-block:: idris
|
|
|
|
sumLengths : Vect m a -> Vect n a —> Nat
|
|
sumLengths xs ys = vlen xs + vlen ys
|
|
|
|
Idris 2 reports::
|
|
|
|
vlen.idr:7:20--7:28:While processing right hand side of Main.sumLengths at vlen.idr:7:1--10:1:
|
|
m is not accessible in this context
|
|
|
|
This means that it needs to use ``m`` as an argument to pass to ``vlen xs``,
|
|
where it needs to be available at run time, but ``m`` is not available in
|
|
``sumLengths`` because it has multiplicity ``0``.
|
|
|
|
We can see this more clearly by replacing the right hand side of
|
|
``sumLengths`` with a hole...
|
|
|
|
.. code-block:: idris
|
|
|
|
sumLengths : Vect m a -> Vect n a -> Nat
|
|
sumLengths xs ys = ?sumLengths_rhs
|
|
|
|
...then checking the hole's type at the REPL::
|
|
|
|
Main> :t sumLengths_rhs
|
|
0 n : Nat
|
|
0 a : Type
|
|
0 m : Nat
|
|
ys : Vect n a
|
|
xs : Vect m a
|
|
-------------------------------------
|
|
sumLengths_rhs : Nat
|
|
|
|
Instead, we need to give bindings for ``m`` and ``n`` with
|
|
unrestricted multiplicity
|
|
|
|
.. code-block:: idris
|
|
|
|
sumLengths : {m, n : _} -> Vect m a -> Vect n a —> Nat
|
|
sumLengths xs ys = vlen xs + vlen xs
|
|
|
|
Remember that giving no multiplicity on a binder, as with ``m`` and ``n`` here,
|
|
means that the variable has unrestricted usage.
|
|
|
|
If you're converting Idris 1 programs to work with Idris 2, this is probably the
|
|
biggest thing you need to think about. It is important to note,
|
|
though, that if you have bound implicits, such as...
|
|
|
|
.. code-block:: idris
|
|
|
|
excitingFn : {t : _} -> Coffee t -> Moonbase t
|
|
|
|
...then it's a good idea to make sure ``t`` really is needed, or performance
|
|
might suffer due to the run time building the instance of ``t`` unnecessarily!
|
|
|
|
One final note on erasure: it is an error to try to pattern match on an
|
|
argument with multiplicity ``0``, unless its value is inferrable from
|
|
elsewhere. So, the following definition is rejected
|
|
|
|
.. code-block:: idris
|
|
|
|
badNot : (0 x : Bool) -> Bool
|
|
badNot False = True
|
|
badNot True = False
|
|
|
|
This is rejected with the error::
|
|
|
|
badnot.idr:2:1--3:1:Attempt to match on erased argument False in
|
|
Main.badNot
|
|
|
|
The following, however, is fine, because in ``sNot``, even though we appear
|
|
to match on the erased argument ``x``, its value is uniquely inferrable from
|
|
the type of the second argument
|
|
|
|
.. code-block:: idris
|
|
|
|
data SBool : Bool -> Type where
|
|
SFalse : SBool False
|
|
STrue : SBool True
|
|
|
|
sNot : (0 x : Bool) -> SBool x -> Bool
|
|
sNot False SFalse = True
|
|
sNot True STrue = False
|
|
|
|
Experience with Idris 2 so far suggests that, most of the time, as long as
|
|
you're using unbound implicits in your Idris 1 programs, they will work without
|
|
much modification in Idris 2. The Idris 2 type checker will point out where you
|
|
require an unbound implicit argument at run time - sometimes this is both
|
|
surprising and enlightening!
|
|
|
|
.. _sect-pmtypes:
|
|
|
|
Pattern Matching on Types
|
|
-------------------------
|
|
|
|
One way to think about dependent types is to think of them as "first class"
|
|
objects in the language, in that they can be assigned to variables,
|
|
passed around and returned from functions, just like any other construct.
|
|
But, if they're truly first class, we should be able to pattern match on
|
|
them too! Idris 2 allows us to do this. For example
|
|
|
|
.. code-block:: idris
|
|
|
|
showType : Type -> String
|
|
showType Int = "Int"
|
|
showType (List a) = "List of " ++ showType a
|
|
showType _ = "something else"
|
|
|
|
We can try this as follows::
|
|
|
|
Main> showType Int
|
|
"Int"
|
|
Main> showType (List Int)
|
|
"List of Int"
|
|
Main> showType (List Bool)
|
|
"List of something else"
|
|
|
|
Pattern matching on function types is interesting, because the return type
|
|
may depend on the input value. For example, let's add a case to
|
|
``showType``
|
|
|
|
.. code-block:: idris
|
|
|
|
showType (Nat -> a) = ?help
|
|
|
|
Inspecting the type of ``help`` tells us::
|
|
|
|
Main> :t help
|
|
a : Nat -> Type
|
|
-------------------------------------
|
|
help : String
|
|
|
|
So, the return type ``a`` depends on the input value of type ``Nat``, and
|
|
we'll need to come up with a value to use ``a``, for example
|
|
|
|
.. code-block:: idris
|
|
|
|
showType (Nat -> a) = "Function from Nat to " ++ showType (a Z)
|
|
|
|
Note that multiplicities on the binders, and the ability to pattern match
|
|
on *non-erased* types mean that the following two types are distinct
|
|
|
|
.. code-block:: idris
|
|
|
|
id : a -> a
|
|
notId : {a : Type} -> a -> a
|
|
|
|
In the case of ``notId``, we can match on ``a`` and get a function which
|
|
is certainly not the identity function
|
|
|
|
.. code-block:: idris
|
|
|
|
notId {a = Int} x = x + 1
|
|
notId x = x
|
|
|
|
::
|
|
|
|
Main> notId 93
|
|
94
|
|
Main> notId "???"
|
|
"???"
|
|
|
|
There is an important consequence of being able to distinguish between relevant
|
|
and irrelevant type arguments, which is that a function is *only* parametric in
|
|
``a`` if ``a`` has multiplicity ``0``. So, in the case of ``notId``, ``a`` is
|
|
*not* a parameter, and so we can't draw any conclusions about the way the
|
|
function will behave because it is polymorphic, because the type tells us it
|
|
might pattern match on ``a``.
|
|
|
|
On the other hand, it is merely a coincidence that, in non-dependently typed
|
|
languages, types are *irrelevant* and get erased, and values are *relevant*
|
|
and remain at run time. Idris 2, being based on QTT, allows us to make the
|
|
distinction between relevant and irrelevant arguments precise. Types can be
|
|
relevant, values (such as the ``n`` index to vectors) can be irrelevant.
|
|
|
|
For more details on multiplicities,
|
|
see `Idris 2: Quantitative Type Theory in Action <https://www.type-driven.org.uk/edwinb/idris-2-quantitative-type-theory-in-action.html>`_.
|