This document explains how to add builtins to the language by working
through the example of adding MVar and some associated functions.
Builtin Data
The logical first step for this example is to add a built-in MVar
type, whose values will simply be wrapped values of the Haskell type
with the same name. The 'old' runtime deviates from this approach for
several types, but this is how e.g. Text works even there.
Data types, including opaque pseudo data types of this sort, are
referred to by Reference. Builtin, opaque data types use the Builtin
constructor with an appropriate name. The ones in actual use are
listed in the Unison.Type module, so we'll add a definition there:
mvarRef :: Reference
mvarRef = Reference.Builtin "MVar"
This definition alone won't do anything, however. It is merely
something for other definitions to refer to. If the reference is used
in e.g. the type of a function definition without giving it an actual
name in the codebase, the type will be displayed with the raw hash,
which looks like #MVar.
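The fallback to a raw hash can be pictured with a small, self-contained sketch. The Reference type, the Names table and the display function below are simplified stand-ins invented for illustration; they are not the definitions from the Unison codebase.

```haskell
-- Simplified stand-in for a reference type (hypothetical, for illustration).
data Reference
  = Builtin String   -- opaque builtin, identified by its name
  | Derived String   -- placeholder for hash-identified definitions
  deriving (Eq, Show)

mvarRef :: Reference
mvarRef = Builtin "MVar"

-- A toy "codebase": the names that have been assigned to references.
type Names = [(Reference, String)]

-- Show the assigned name if there is one, otherwise the raw reference.
display :: Names -> Reference -> String
display names r = case lookup r names of
  Just n  -> n
  Nothing -> raw r
  where
    raw (Builtin n) = "#" ++ n
    raw (Derived h) = "#" ++ h
```

With an empty name table, display [] mvarRef falls back to the raw form, while a table containing an entry for mvarRef yields the assigned name instead.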
The builtin reference can be given a name during the builtins.merge
ucm command. To make this happen, we must modify the builtinTypesSrc
definition in the Unison.Builtin module. This is just a list of
values that describe various builtin type related actions to be
performed during that command. In this case, we will add two values to
the list:
B' "MVar" CT.Data
This specifies that there should be a builtin data type referring to
the Builtin "MVar" reference. The codebase name assigned to this is
the same as the reference (MVar here), but nested in the builtin
namespace. However, we will also add the value:
Rename' "MVar" "io2.MVar"
because this is a type to be used with the new IO functions, which are
currently nested under the io2 namespace. With both of these added
to the list, running builtins.merge should produce a builtin.io2.MVar
type referring to the Builtin "MVar" reference.
The reason for both a B' and a Rename' is that eventually one
would expect the IO functionality to be moved from the io2
namespace. However, the builtin reference name may not be changed
easily, so it is preferable to have it named in the eventual expected
way, rather than permanently named io2.MVar internally.
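The effect of pairing a declaration with a rename can be illustrated with a toy model. Everything below (the declaration type, the NameTable, applyDecl and resolve) is a hypothetical simplification written for this explanation; it does not reproduce how builtins.merge is actually implemented.

```haskell
-- Hypothetical stand-ins for the declaration values (illustration only).
data ConstructorType = Data | Effect
  deriving (Eq, Show)

data BuiltinTypeDecl
  = B' String ConstructorType   -- declare a builtin type by reference name
  | Rename' String String       -- move a name to a new location

-- Maps a codebase name to the builtin reference name it points at.
type NameTable = [(String, String)]

-- Apply one declaration to the table.
applyDecl :: NameTable -> BuiltinTypeDecl -> NameTable
applyDecl table (B' r _) = (r, r) : table
applyDecl table (Rename' old new) =
  [ (if n == old then new else n, r) | (n, r) <- table ]

-- Process declarations in order, then nest everything under "builtin".
resolve :: [BuiltinTypeDecl] -> NameTable
resolve = map qualify . foldl applyDecl []
  where
    qualify (n, r) = ("builtin." ++ n, r)
```

In this model, resolve [B' "MVar" Data, Rename' "MVar" "io2.MVar"] yields a single entry whose name is builtin.io2.MVar while the reference name stays MVar, which is the separation between visible name and internal reference described above.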
Builtin function declarations
The next step is to declare builtin functions that make use of the new
type. These are declared in a similar way to the type names above.
There is another list in Unison.Builtin, builtinsSrc, that defines
values specifying what builtin functions should exist.
Like the builtin type list, there are declarations for adding a
builtin function with a given name, and declarations for renaming from
the given name to a different namespace location. For the MVar
functions, we'll again give them their intended names as the original,
and rename them to the io2 namespace for the time being.
Builtin functions also have an associated type as part of the initial declaration. So for the complete specification of a function, we will add declarations similar to:
B "MVar.new" $ forall1 "a" (\a -> a --> io (mvar a))
Rename "MVar.new" "io2.MVar.new"
B "MVar.take" $ forall1 "a" (\a -> mvar a --> ioe a)
Rename "MVar.take" "io2.MVar.take"
The forall1, io, ioe and --> functions are local definitions
in Unison.Builtin for assistance in writing the types. ioe
indicates that an error result may be returned, while io should
always succeed. mvar can be defined locally using some other
helpers in scope:
mvar :: Var v => Type v -> Type v
mvar a = Type.ref () Type.mvarRef `app` a
For the actual MVar implementation, we'll be doing many definitions
followed by renames, so it'll be factored into a list of the name and
type, together with a function that generates the declaration and the
rename.
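A minimal sketch of that factoring follows. The BuiltinDecl type, the declareUnder helper and the use of plain strings in place of real type values are all assumptions made for illustration; the helper in Unison.Builtin may have a different name and shape.

```haskell
-- Hypothetical stand-in for the declaration values in builtinsSrc.
-- The type parameter ty stands for whatever represents a Unison type.
data BuiltinDecl ty
  = B String ty            -- declare a builtin function with its type
  | Rename String String   -- move it to another namespace location
  deriving (Eq, Show)

-- Generate the declaration and the matching rename for one entry.
declareUnder :: String -> (String, ty) -> [BuiltinDecl ty]
declareUnder prefix (name, ty) =
  [ B name ty
  , Rename name (prefix ++ "." ++ name)
  ]

-- Name/type pairs; the types are written as strings purely as placeholders.
mvarBuiltins :: [(String, String)]
mvarBuiltins =
  [ ("MVar.new",  "a -> io (mvar a)")
  , ("MVar.take", "mvar a -> ioe a")
  ]

-- The full list of declarations, each followed by its rename.
mvarDecls :: [BuiltinDecl String]
mvarDecls = concatMap (declareUnder "io2") mvarBuiltins
```

Adding another MVar operation then only requires one more pair in mvarBuiltins, and moving the functions out of io2 later only requires changing the prefix.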
Builtin function implementation -- new runtime
What we have done so far only declares the functions and their types.
There is nothing yet implementing them. This section will proceed
through the implementation backing the declarations of MVar.new
and MVar.take above.
In this case, we will implement the operations using the 'foreign
function' machinery. This path is somewhat less optimized, but
doesn't require inventing opcodes and modifying the runtime at
quite as low a level. The builtin 'foreign' functions are declared
in Unison.Runtime.Builtin, in a definition declareForeigns. We
can declare our builtins there by adding:
declareForeign "MVar.new" mvar'new
  . mkForeign $ \(c :: Closure) -> newMVar c
declareForeign "MVar.take" mvar'take
  . mkForeignIOE $ \(mv :: MVar Closure) -> takeMVar mv
These lines do multiple things at once. The first argument to
declareForeign must match the name from Unison.Builtin, as this
is how they are associated. The second argument is wrapper code
that actually defines the Unison function that will be called, and
the definitions for these two cases will be shown later. The last
argument is the actual Haskell implementation of the operation.
However, the format for foreign functions is somewhat more limited
than 'any Haskell function,' so the mkForeign and mkForeignIOE
helpers assist in wrapping Haskell functions correctly. The latter
will catch some exceptions and yield them as explicit results.
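The difference between the two styles of helper can be sketched in plain Haskell using only base. The names runDirect and runCatching are hypothetical, and catching IOException specifically is an assumption made for this sketch; which exceptions mkForeignIOE actually catches is not stated here.

```haskell
import Control.Concurrent.MVar (MVar, newMVar, takeMVar)
import Control.Exception (IOException, try)

-- "Always succeeds" style: the result is passed through untouched.
runDirect :: (a -> IO b) -> a -> IO b
runDirect f = f

-- "May fail" style: an IOException becomes an explicit Left value
-- instead of propagating (assumption: only IOException is caught).
runCatching :: (a -> IO b) -> a -> IO (Either String b)
runCatching f x = fmap (either (Left . render) Right) (try (f x))
  where
    render :: IOException -> String
    render = show

-- The two MVar operations wrapped in each style.
mvarNew :: a -> IO (MVar a)
mvarNew = runDirect newMVar

mvarTake :: MVar a -> IO (Either String a)
mvarTake = runCatching takeMVar
```

Here mvarTake returns a Right value on success, and an action that raises an IOException is reported as a Left value rather than aborting the caller.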
The wrapper code for these two operations looks like:
mvar'new :: ForeignOp
mvar'new instr
  = ([BX],)
  . TAbs init
  $ TFOp instr [init]
  where
  [init] = freshes 1

mvar'take :: ForeignOp
mvar'take instr
  = ([BX],)
  . TAbs mv
  $ io'error'result'direct instr [mv] ior e r
  where
  [mv,ior,e,r] = freshes 4
The breakdown of what is happening here is as follows:
- instr is an identifier that is used to decouple the wrapper code
  from the actual Haskell implementation functions. It is made up in
  declareForeign and passed to the wrapper to use as a sort of
  instruction code.
- A ForeignOp may take many arguments, and the list in the tuple
  section specifies the calling convention for them. [BX] means one
  boxed argument, which in this case is the value of type a. [BX,BX]
  would be two boxed arguments, and [BX,UN] would be one boxed and one
  unboxed argument. Builtin wrappers will currently be taking all
  boxed arguments, because there is no way to talk about unboxed
  values in the surface syntax where they are called.
- TAbs init abstracts the argument variable, which we got from
  freshes at the bottom. Multiple arguments may be abstracted with
  e.g. TAbss [x,y,z].
- io'error'result'direct is a helper function for calling the
  instruction and wrapping up a possible error result. The first
  argument is the identifier to call, the list is the arguments, and
  the last three arguments are variables used in the common result
  handling code.
- TFOp simply calls the instruction with the assumption that the
  result value is acceptable for directly returning. MVar values will
  be represented directly by their Haskell values wrapped into a
  closure, so the mvar'new code doesn't need to do any processing of
  the results of its foreign function.
Other builtins use slightly different implementations, so looking at other parts of the file may be instructive, depending on what is being added.
At first, our declarations will cause an error, because some of the
automatic machinery for creating builtin 'foreign' functions does not
exist for MVar. To rectify this, we can add a ForeignConvention
instance in Unison.Runtime.Foreign.Function that specifies how to
automatically marshal MVar Closure, which is the representation
we'll be using.
instance ForeignConvention (MVar Closure) where
  readForeign = readForeignAs (unwrapForeign . marshalToForeign)
  writeForeign = writeForeignAs (Foreign . Wrap mvarRef)
This takes advantage of the Closure instance, and uses helper
functions that apply (un)wrappers from another convention.
With these in place, the functions should now be usable in the new runtime.
Decompilation
If it makes sense for an added type, it is possible to add to Unison's
ability to decompile runtime values or test for universal
equality/ordering. Directly embedded Haskell types are wrapped in the
Foreign type, and are decompiled in Unison.Runtime.Decompile using
the decompileForeign function. For instance, Text is decompiled in
the case:
| Just t <- maybeUnwrapBuiltin f = Right $ text () t
Further cases may be added using maybeUnwrapBuiltin, which just
requires adding an instance to the BuiltinForeign class in
Unison.Runtime.Foreign, specifying which builtin reference
corresponds to the type.
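The idea of tagging a wrapped Haskell value with its builtin reference, and unwrapping it only when the tag and type agree, can be modelled with Data.Dynamic. This is a self-contained approximation: the Foreign, BuiltinForeign, maybeUnwrapBuiltin and decompileForeign definitions below reuse the names from the text but are hypothetical simplifications, and the Text newtype is a stand-in for the real text type.

```haskell
import Control.Concurrent.MVar (MVar, newMVar, readMVar)
import Data.Dynamic (Dynamic, Typeable, fromDynamic, toDyn)

-- Simplified stand-ins (not the real Unison definitions).
data Reference = Builtin String
  deriving (Eq, Show)

data Foreign = Wrap Reference Dynamic

newtype Text = Text String
  deriving (Eq, Show)

-- Says which builtin reference corresponds to a Haskell type.
-- The argument is only used to select the instance.
class Typeable a => BuiltinForeign a where
  builtinRef :: a -> Reference

instance BuiltinForeign Text where
  builtinRef _ = Builtin "Text"

instance Typeable a => BuiltinForeign (MVar a) where
  builtinRef _ = Builtin "MVar"

wrapBuiltin :: BuiltinForeign a => a -> Foreign
wrapBuiltin x = Wrap (builtinRef x) (toDyn x)

-- Succeeds only if both the stored type and the reference match.
maybeUnwrapBuiltin :: BuiltinForeign a => Foreign -> Maybe a
maybeUnwrapBuiltin (Wrap ref dyn) = do
  x <- fromDynamic dyn
  if builtinRef x == ref then Just x else Nothing

-- Toy decompiler: only Text has a printable form in this sketch.
decompileForeign :: Foreign -> Either String String
decompileForeign f
  | Just (Text t) <- maybeUnwrapBuiltin f = Right (show t)
  | otherwise = Left "no decompilation for this foreign value"

-- Wrap an MVar, unwrap it again, and read the value back.
roundTrip :: IO Bool
roundTrip = do
  mv <- newMVar (42 :: Int)
  case maybeUnwrapBuiltin (wrapBuiltin mv) of
    Just mv' -> (== 42) <$> readMVar (mv' :: MVar Int)
    Nothing  -> pure False
```

In this model a wrapped Text decompiles to its quoted contents, while a wrapped MVar falls through to the Left case, since an MVar has no obvious source-level form.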
Transcripts
One last thing remains. The additional builtin operations will have changed some of the transcript output. CI checks transcripts against an expected result, so the transcript runner should be executed, and the modified files should be checked and committed, in order for those tests to pass.