1
1
mirror of https://github.com/tweag/nickel.git synced 2024-10-05 15:47:33 +03:00
nickel/doc/manual/typing.md

560 lines
22 KiB
Markdown
Raw Normal View History

# Typing in Nickel
## Introduction
Static typing or dynamic typing? Here comes the eternal debate. While the idea
that static typing is pretty much a necessity for large codebases of
general-purpose programming languages is well established, the case of an
interpreted configuration language may appear less clear-cut.
Nevertheless, who has ever faced puzzling dynamic type errors (as in [Nix]())
may have felt the strong need for something better. Standard dynamic typing is
prone to error messages being unrelated to the actual issue while being located
far from the offending code. This is especially salient when working with
functions, which tend to delay type errors by passing around ill-formed values
until they eventually break evaluation somewhere else.
On the other hand, for pure configuration code, static typing is less useful.
First, a configuration is a terminating program run once on fixed inputs: here,
basic type errors will show up at runtime anyway. What's more, Nickel has a
powerful validation system, contracts, that can do the same job as types, and
more.
For those reasons, Nickel takes an hybrid approach called *gradual typing*.
Gradual typing enables to mix both static typing and dynamic typing.
You should at least skim at this document first, but once you have even a
partial understanding of how typing works, you should look at the practical
guide [Type versus contracts: when to?](./types-vs-contracts.md) if what you
seek is knowing when and how to annotate your code.
## Typing modes
### Untyped by default
By default, Nickel code is assumed to be configuration code. It is thus run in
*untyped* mode (understand dynamically typed).
Example:
```nickel
{
name = "hello",
version = "0.1.1",
fullname =
if builtins.isNum version then
"hello-v#{strings.fromNum version}"
else
"hello-#{version}",
}
```
While this is fine for configuration code, especially when finally checked
against a contract, it doesn't work so well once we are using functions.
```
let filter = fun pred l =>
lists.foldl (fun acc x => if pred x then acc @ [x] else acc) [] l in
filter (fun x => if x % 2 == 0 then x else null) [1,2,3,4,5,6]
```
Result:
```
error: Type error
┌─ repl-input-11:2:32
2 │ lists.foldl (fun acc x => if pred x then acc @ [x] else acc) [] l in
│ ^^^^^^ This expression has type Num, but Bool was expected
3 │ filter (fun x => if x % 2 == 0 then x else null) [1,2,3,4,5,6]
│ - evaluated to this
= if
```
This example shows how dynamic typing delays type errors, making them harder to
diagnose. Here, `filter` is fine, but the error still points to inside its
implementation. The actual issue is that the caller provided an argument of the
wrong type: the function should return a boolean, but returns either the
original element or `null`. This is a tiny example, so it may be workable
nonetheless. Alas, in a real code base, the user (who probably wouldn't even be
the author of `filter` in practice) might have a hard time reaching the right
conclusion from the error report.
### Typed blocks
The `filter` example is the poster child for static typing. The typechecker will
catch the error early as the type expected by `filter` and the return type of
the filtering function passed as the argument don't match .
To call the typechecker to the rescue, we use `:` to introduce a *type
annotation*. This annotation switches the typechecker on for the annotated
expression, be it a variable definition, a record field or any expression via an
inline annotation. We will refer to such an annotated expression as a *typed
block*.
Examples:
```
// Let binding
let f : Num -> Bool = fun x => x % 2 == 0 in
// Record field
let r = {
count : Num = 2354.45 * 4 + 100,
} in
// Inline
1 + ((if f 10 then 1 else 0) : Num)
```
Let us try on the filter example. We want the call to be inside the typechecked
block. The easiest way is to add an annotation at the top-level:
```nickel
(let filter | (Num -> Bool) -> List Num -> List Num = fun pred l =>
lists.foldl (fun acc x => if pred x then acc @ [x] else acc) [] l in
filter (fun x => if x % 2 == 0 then x else null) [1,2,3,4,5,6]) : List Num
```
Result:
```
error: Incompatible types
┌─ repl-input-12:3:37
3 │ filter (fun x => if x % 2 == 0 then x else null) [1,2,3,4,5,6]) : List Num
│ ^ this expression
= The type of the expression was expected to be `Bool`
= The type of the expression was inferred to be `Num`
= These types are not compatible
```
That's already better! The error now points at the call site, and inside our
anonymous functions, telling us it is expected to return a boolean. What's more,
we just had to give the top-level annotation `List Num`. Nickel performs type
inference, so that you don't have to write the type for `filter`, the filtering
function nor the list. Although there's a twist about type inference and
polymorphism (see the [dedicated section](#Polymorphism)), but we'll come back
to this later.
**Take-away**: Nickel takes an hybrid approach to typing that mixes both static
typing and dynamic typing. The default is dynamic typing. The static typechecker
kicks in when using a type annotation `exp : Type`. This delimits a typed block.
Nickel has type inference, sparing you writing unnecessary type annotations.
Now, let us have a quick tour of the type system.
## Type system
The basic types are:
- `Dyn`: the dynamic type. This is the type given to most expressions outside of
a typed block. A value of type `Dyn` can be pretty much anything.
- `Num`: the only number type. Currently implemented as a 64bits float.
- `Str`: a string.
- `Bool`: a boolean, that is either `true` or `false`.
<!-- - `Lbl`: a contract label. You usually don't need to use it or worry about it, -->
<!-- it is more of an internal thing. -->
The following type constructors are available:
- **List**: `List T`. A list of elements of type `T`. When no `T` is specified, `List`
alone is an alias for `List Dyn`.
Example:
```nickel
let x : List (List Num) = [[1,2], [3,4]] in
lists.flatten x : List Num
```
- **Record**: `{field1: T1, .., fieldn: Tn}`. A record whose field
names are known statically as `field1`, .., `fieldn`, respectively of type
`T1`, .., `Tn`.
Example:
```nickel
let pair : {fst: Num, snd: Str} = {fst = 1, snd = "a"} in
pair.fst : Num
```
- **Dynamic record**: `{_: T}`. A record whose field
names are statically unknown but are all of type `T`. Typically used to model
dictionaries.
Example:
```nickel
let occurences : {_: Num} = {a = 1, b = 3, c = 0} in
records.map (fun char count => count+1) occurences : {_ : Num}
```
- **Enum**: `<tag1, .., tagn>`: an enumeration comprised of alternatives `tag1`,
.., `tagn`. An enumeration literal is prefixed with a backtick and serialized
as a string. It is useful to encode finite alternatives. The advantage over
strings is that the typechecker handles them more finely: it is able to detect
incomplete matches, for example.
Example:
```nickel
let protocol : <http, ftp, sftp> = `http in
(switch {
`http => 1,
`ftp => 2,
`sftp => 3
} protocol) : Num
```
- **Arrow (function)**: `S -> T`. A function taking arguments of type `S` and returning a value of
type `T`. For multi-parameters functions, just iterate the arrow constructor.
Example:
```nickel
{
incr : Num -> Num = fun x => x + 1,
mkPath : Str -> Str -> Str -> Str = fun basepath filename ext =>
"#{basepath}/#{filename}.#{ext}",
}
```
### Polymorphism (generics)
#### Plain polymorphism
Let us try to write a type annotation for our `filter` function:
```nickel
(let filter = ... in
filter ...) : List Num
```
What should it be? In this specific example, the type inferred by the
typechecker is `(Num -> Bool) -> List Num -> List Num`, guessed from the
application to the arguments. Thus, if we try to use `filter` on something else,
we get in trouble:
```nickel
nickel>
(let filter = ... in
let result = filter (fun x => x % 2 == 0) [1,2,3,4,5,6] in
let dummy = filter (fun s => strings.length s > 2) ["a","ab","abcd"] in
result) : List Num
```
Result:
```
error: Incompatible types
┌─ repl-input-35:2:37
2 │ let dummy = filter (fun s => strings.length s > 2) ["a","ab","abcd"] in
│ ^ this expression
= The type of the expression was expected to be `Str`
= The type of the expression was inferred to be `Num`
= These types are not compatible
```
That's too bad, because in practice the code of `filter` is agnostic with
respect to the type of elements of the list. That is, `filter` is *generic*.
This can be expressed in Nickel with so-called *parametric polymorphism*. This
is really just a pedantic name for some flavour of generics. Generic parameters
are introduced by the keyword `forall`, and can then be used as any other type.
We can now fix our example:
```nickel
(let filter : forall a. (a -> Bool) -> List a -> List a in
let result = filter (fun x => x % 2 == 0) [1,2,3,4,5,6] in
let dummy = filter (fun s => strings.length s > 2) ["a","ab","abcd"] in
result) : List Num
```
Result:
```
[2,4,6]
```
And now it works! `forall a. (a -> Bool) -> List a -> List a` means it is OK to
substitute `a` for any type. You can use as many generic parameters as you need:
```nickel
let fst : forall a b. a -> b -> a = fun x y => x in
let snd : forall a b. a -> b -> b = fun x y => y in
{ n = fst 1 "a", s = snd 1 "a" } : {n: Num, s: Str}
```
Or even nest them:
```nickel
let higherRankId : forall a. (forall b. b -> b) -> a -> a
= fun id x => id x in
let id : forall a. a -> a
= fun x => x in
higherRankId id 0 : Num
```
**Remark**: if you are a more type-inclined reader out there, you may wonder why
the typechecker is not capable of inferring a polymorphic type for `filter` by
itself. Indeed, Hindley-Milner type-inference can precisely infer heading
`foralls`, such that the first example without the polymorphic annotation would
already pass. We chose to abandon this so-called automatic generalization,
because it makes other aspects of the type system and type inference more
complex or limited. Requiring annotation of polymorphic functions seems like a
good practice and an acceptable price to pay in exchange of making the type
system simpler and more easily extensible, in a non type-heavy configuration
language.
#### Row polymorphism
In a configuration language, you will often find yourselves handling records of
various kinds. In a simple type system, you can hit the following issue:
```nickel
(let addTotal: {total: Num} -> {total: Num} -> Num
= fun r1 r2 => r1.total + r2.total in
let r1 = {jan = 200, feb = 300, march = 10, total = jan + feb} in
let r2 = {aug = 50, sept = 20, total = aug + sept} in
let r3 = {may = 1300, june = 400, total = may + june} in
{
partial1 = addTotal r1 r2,
partial2 = addTotal r2 r3,
}) : {partial1: Num, partial2: Num}
```
```
error: Type error: extra row `sept`
┌─ repl-input-40:8:23
8 │ partial2 = addTotal r2 r3,
│ ^^ this expression
= The type of the expression was expected to be `{total: Num}`, which does not contain the field `sept`
= The type of the expression was inferred to be `{total: Num, sept: Num, aug: Num}`, which contains the extra field `sept`
```
The problem here is that for this code to run fine, the requirement of
`addTotal` should be that both arguments have a field `total: Num`, but could
very well have other fields, for all we care. Unfortunately, we don't know right
now how to express this constraint. The current annotation is too restrictive,
because it imposes that arguments have exactly one field `total: Num`, and
nothing more.
To express such constraints, Nickel features *row polymorphism*. The idea is
similar to polymorphism, but instead substituting a parameter for a single type,
we can substitute a parameter for a whole sequence of field declarations, also
referred as rows. Let us first fix our working example:
```nickel
(let addTotal: forall a b. {total: Num | a} -> {total: Num | b} -> Num
= fun r1 r2 => r1.total + r2.total in
let r1 = {jan = 200, feb = 300, march = 10, total = jan + feb} in
let r2 = {aug = 50, sept = 20, total = aug + sept} in
let r3 = {may = 1300, june = 400, total = may + june} in
{
partial1 = addTotal r1 r2,
partial2 = addTotal r2 r3,
}) : {partial1: Num, partial2: Num}
```
Result:
```
{partial1 = 570, partial2 = 1770}
```
In the type of `addTotal`, the part `{total: Num | a}` expresses exactly what we
wanted: the argument must have a field `total: Num`, but the *tail* (the rest of
the record type) is generic, and `a` may be substituted for an arbitrary field
lists (like `jan: Num, feb: Num`). We used two different generic parameters `a`
and `b`, to express that the tails of the arguments may differ. If we used `a`
in both places, as in `forall a. {total: Num | a} -> {total: Num | a} -> Num`,
we could still write `addTotal {total = 1, foo = 1} {total = 2, foo = 2}` but
`addTotal {total = 1, foo = 1} {total = 2, bar = 2}` would be rejected. Using
dinstinct parameters `a` and `b` gives us maximum flexibility.
What comes before the tail may include several fields, is in e.g. `forall a.
{total: Num, subtotal: Num | a} -> Num`.
Note that row polymorphism also works with enums, with the same intuition of a
tail that can be substituted for something else. For example:
```nickel
let portOf : forall a. <http, ftp | a> -> Num = fun protocol =>
switch {
`http -> 80,
`ftp -> 21,
_ -> 8000,
} protocol
```
**Take-away**: The type system of Nickel has a few basic types (`Dyn`, `Num`, `Str`,
and `Bool`) and type constructors for lists, records, enums and functions.
Nickel features generics via polymorphism, introduced by the `forall` keyword. A
type can not only be generic in other types, but records and enums types can
also be generic in their tail. The tail is delimited by `|`.
In this section, we've been focusing solely on the static typing side. We'll now
see how typed and untyped code interact.
## Interaction between typed and untyped code
### Using typed code inside untyped code
I invite to you come back to our first love, the `filter` example. In the
previous section, we included everything -- both the definition of `filter` and
the call site -- in a typed block. More realistically, `filter` would be a typed
library function (it is actually part of the stdlib as `lists.filter`), but is
likely to be called from untyped configuration code. Without an additional
mechanism, this would just ensure that the implementation of `filter` is without
flaws, but since the call site still escapes the typechecker, we may end up with
the very same initial problem. Remember, the caller passes an unchecked, invalid
value that is raising an error from within `filter`.
Hopefully, Nickel prevents us from this happening. Let us see by ourselves:
```nickel
lists.filter (fun x => if x % 2 == 0 then x else null) [1,2,3,4,5,6]
```
Result:
```
error: Blame error: contract broken by the caller.
┌─ :1:17
1 │ forall a. (a -> Bool) -> List a -> List a
│ ---- expected return type of a function provided by the caller
┌─ repl-input-45:1:67
1 │ lists.filter (fun x => if x % 2 == 0 then x else null) [1,2,3,4,5,6]
│ - evaluated to this expression
= This error may happen in the following situation:
1. A function `f` is bound by a contract: e.g. `(Num -> Num) -> Num`.
2. `f` takes another function `g` as an argument: e.g. `f = fun g => g 0`.
3. `f` is called by with an argument `g` that does not respect the contract: e.g. `f (fun x => false)`.
= Either change the contract accordingly, or call `f` with a function that returns a value of the right type.
= Note: this is an illustrative example. The actual error may involve deeper nested functions calls.
note:
┌─ <stdlib/lists>:160:14
160 │ filter : forall a. (a -> Bool) -> List a -> List a
│ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ bound here
[...]
note:
┌─ repl-input-45:1:1
1 │ lists.filter (fun x => if x % 2 == 0 then x else null) [1,2,3,4,5,6]
│ -------------------------------------------------------------------- (3) calling <func>
```
We call `filter` from an untyped location, but still get a spot-on error.
What's is happening? The interpreter protects typed block by a
[contract](./contracts.md), which is a runtime check. Please refer to the
dedicated documentation for more details. For now, you can just remember that
*any type annotation*, wherever it is, gives rise at runtime to a corresponding
contract check. In other words, `foo: T` and `foo | T` (here `|` is contract
application, not the row tail separator) behave exactly the same at *runtime*.
This approach of protecting typed code from untyped code is known as *sound
gradual typing*.
Thanks to this, you can type your library functions and use them from untyped
configuration code while still getting good error messages.
### Using untyped code inside typed code
In the other direction, we face a different issue. How can we use untyped values
inside typed code? Let us try:
```
let x = 1 in
(1 + x : Num)
```
Although `x` should be untyped, it works! Hmm. What about:
```
let x = 0 + 1 in
(1 + x : Num)
```
Now, it doesn't, as we would expect originally. As it is the case here, you can
use an identifier defined in untyped code from within a typed block. Thus, the
typechecker needs to assign a type to every identifier, even outside of a typed
block. But it also tries to obey the will of the programmer. If one doesn't use
annotations, then the code shouldn't be typechecked, either for performance
reasons, because it can't be well-typed, etc. In the end, the reason doesn't
matter: if you want `x` to be typed, you should annotate it.
The typechecker still tries its best not to be too stupid either. It is obvious
in the first case that `1` is of type `Num`. This information is cheap to
gather. Thus, when encountering a binding outside of a typed block, the
typechecker determines the *apparent type* of the definition. The rationale is
that determining the apparent type shouldn't recurse arbitrarily inside the
expression or do anything non-trivial. Typically, our previous `0 + 1` is a
compound expression with a primitive operator, so the apparent type is just
`Dyn`. For now, the typechecker determine an apparent type that is not `Dyn` for
literals (numbers, strings, booleans), variables, and annotated expressions.
Otherwise, the typechecker fallbacks to `Dyn`. In the future, it could also
infer `Dyn -> Dyn` for functions, `{_: Dyn}` for records, and so on. As of now
(INSERT VERSION HERE), it doesn't.
We could add a type annotation to `x`. But sometimes we don't want to, or we
can't. Maybe we don't want the typechecker to do its ceremony and recurse in the
whole expression. Maybe the expression is not typeable in Nickel, but we still
know it will evaluate to a number. In any of these situations, we can trade the
type annotation for a contract application:
Exmaple:
```
let x | Num = if true then 0 else "a" in
(1 + x : Num)
```
Here, `x` is clearly always a number, but it is not well-typed (the `then` and
`else` branches of an `if` must have the same type). Nonetheless, this program
is accepted! The rationale is that because we inserted a contract application,
the typechecker knows that if `x` is not a number, the program will fail early
with a detailed contract error. Thus, if we reach `1 + x`, `x` is necessarily a
number at this point, and everything will be fine. In a way, the contract
application acts like a type cast, but whose verification is delayed to
run-time.
When using a contract annotation inside typed code, the *typechecker is turned
off again*. Contract application is dual to type annotation with respect to
typechecking. This is illustrated by the following variation being accepted:
```nickel
(1 + ((if true then 0 else "a" | Num)) : Num
```
While, as said before, this one is rejected:
```nickel
(1 + (if true then 0 else "a")) : Num
```
Result:
```
error: Incompatible types
┌─ repl-input-46:1:27
1 │ (1 + (if true then 0 else "a")) : Num
│ ^^^ this expression
= The type of the expression was expected to be `Num`
= The type of the expression was inferred to be `Str`
= These types are not compatible
```
**Take-away**: When calling to typed code from untyped code, Nickel
automatically inserts contract checks at the boundary to enjoy clearer and
earlier error reporting. In the other direction, an expression `exp | Type` is
blindly accepted to be of type `Type` by the typechecker. This is a way of using
untyped values inside typed code by telling the typechecker "trust me on this
one, and if I'm wrong there will be a contract error anyway". Finally, while a
type annotation switches the typechecker on, a contract annotation switches it
back off.
If you read until this point, you may feel confused. We have static types on one
side and we have contracts on the other, but type annotations and contract
applications actually behave the same at runtime. Although contracts seem to be
a runtime thing, contract applications also have an effect on the typechecker.
What's more, they look like they cover similar use cases (check that some value
has a specific shape or behavior), and the annotations are written in the same
type syntax, differing only in the symbols `:` and `|`. How do we know when to
use one or the other in practice? This is what the guide [Type versus
contracts: when to?](./types-vs-contracts.md) is for!