mirror of https://github.com/tweag/nickel.git synced 2024-10-05 15:47:33 +03:00

Yann Hamdaoui 30f3e942c6 Another pass on the section on typing

2022-01-03 18:47:08 +01:00

Typing in Nickel

Introduction

Usually, static typing brings in important benefits for large codebases of general-purpose programming languages, but the case of an interpreted configuration language appears less clear-cut.

For pure configuration code, which is mostly data, static typing is not as useful. First, a configuration is a terminating program run once on fixed inputs: here, basic type errors will show up at runtime anyway. Second, Nickel has a powerful validation system, contracts, that can do the same job as types and more.

Nevertheless, if you have ever faced puzzling dynamic type errors, you may have felt the need for something better. Classic dynamic typing is prone to error messages being unrelated to the actual issue and pointing to a location far from the offending code. This is especially salient when working with functions, which tend to delay type errors by passing around ill-formed values until they eventually break evaluation somewhere else. For reusable code, i.e. functions, static typing really helps.

Nickel solves this apparent dilemma by supporting gradual typing, which makes it possible to mix static typing and dynamic typing in the same program.

The following is a detailed exposition of this gradual type system. If you are rather looking for a cheat-sheet about when to use static typing or contracts, please visit Type versus contracts: when to?.

Typing modes

Dynamic typing

By default, Nickel code is dynamically typed. For example:

{
  name = "hello",
  version = "0.1.1",
  fullname =
    if builtins.is_num version then
      "hello-v#{strings.fromNum version}"
    else
      "hello-#{version}",
}

As long as we operate on basic data (numbers, strings, etc.), dynamic type errors can be sufficient. Let us introduce an error on the last line of the previous example:

{
  name = "hello",
  version = "0.1.1",
  fullname =
    if builtins.is_num version then
      "hello-v#{strings.fromNum version}"
    else
      "hello-#{version + 1}",
}

version is a string, and can't be added to a number. If we try to export this configuration using nickel export, we get a reasonable error message:

error: Type error
  ┌─ repl-input-3:8:16
  │
3 │   version = "0.1.1",
  │             ------- evaluated to this
  ·
8 │       "hello-#{version + 1}",
  │                ^^^^^^^ This expression has type Str, but Num was expected
  │
  = +, 1st argument

While dynamic typing is fine for configuration code, the trouble begins once we are using functions. Say we want to filter over a list of elements:

let filter = fun pred l =>
  lists.foldl (fun acc x => if pred x then acc @ [x] else acc) [] l in
filter (fun x => if x % 2 == 0 then x else null) [1,2,3,4,5,6]

Result:

error: Type error
  ┌─ repl-input-11:2:32
  │
2 │   lists.foldl (fun acc x => if pred x then acc @ [x] else acc) [] l in
  │                                ^^^^^^ This expression has type Num, but Bool was expected
3 │ filter (fun x => if x % 2 == 0 then x else null) [1,2,3,4,5,6]
  │                                                             - evaluated to this
  │
  = if

This example illustrates how dynamic typing delays type errors, making them harder to diagnose. Here, filter is fine, but the error still points to inside its implementation. The actual issue is that the caller provided an argument of the wrong type: the filtering function should return a boolean but returns either the original element (a number) or null. This is a tiny example, so debugging is still doable here. In a real code base, the user (who probably wouldn't even be the author of filter) might have a harder time solving the issue from the error report.

Static typing

The filter example is the poster child for static typing. The typechecker will catch the error early, because the type expected by filter and the return type of the filtering function passed as the argument don't match.

To call the typechecker to the rescue, use : to introduce a type annotation. This annotation switches the typechecker on inside the annotated expression, be it a variable definition, a record field or any expression using an inline annotation. We will refer to such an annotated expression as a statically typed block.

Example:

// Let binding
let f : Num -> Bool = fun x => x % 2 == 0 in

// Record field
let r = {
  count : Num = 2354.45 * 4 + 100,
} in

// Inline
1 + ((if f 10 then 1 else 0) : Num)

Let us try it on the filter example. We want the call to be inside the statically typechecked block. The easiest way is to capture the whole expression by adding a type annotation at the top-level:

(let filter = fun pred l =>
     lists.foldl (fun acc x => if pred x then acc @ [x] else acc) [] l in
filter (fun x => if x % 2 == 0 then x else null) [1,2,3,4,5,6]) : List Num

Result:

error: Incompatible types
  ┌─ repl-input-12:3:37
  │
3 │ filter (fun x => if x % 2 == 0 then x else null) [1,2,3,4,5,6]) : List Num
  │                                     ^ this expression
  │
  = The type of the expression was expected to be `Bool`
  = The type of the expression was inferred to be `Num`
  = These types are not compatible

This is already better! The error now points at the call site, inside our anonymous function, telling us it is expected to return a boolean instead of a number. Notice how we just had to give the top-level annotation List Num. Nickel performs type inference, so you don't have to write the type of filter, of the filtering function, or of the list.
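For completeness, here is a sketch of the same program once the predicate is fixed to return a boolean. It reuses the definition of filter from above unchanged and has not been checked against a specific Nickel version; under the behavior described in this section, the typechecker is expected to accept it and evaluation to keep only the even elements.

```nickel
// Same `filter` as above; only the predicate passed at the call site changes.
(let filter = fun pred l =>
     lists.foldl (fun acc x => if pred x then acc @ [x] else acc) [] l in
// The predicate now returns a Bool, which is what `filter` expects.
filter (fun x => x % 2 == 0) [1,2,3,4,5,6]) : List Num
```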

Take-away

Nickel is gradually typed, meaning you can mix both static typing and dynamic typing. The default is dynamic typing. The static typechecker kicks in when using a type annotation exp : Type, which delimits a statically typed block.

Nickel also has type inference, sparing you from writing unnecessary type annotations.

Type system

Let us now have a quick tour of the type system. The basic types are:

Dyn: the dynamic type. This is the type given to most expressions outside of a typed block. A value of type Dyn can be pretty much anything.
Num: the only number type. Currently implemented as a 64-bit float.
Str: a string, which must always be valid UTF-8.
Bool: a boolean, that is either true or false.

The following type constructors are available:

List: List T. A list of elements of type T. When no T is specified, List alone is an alias for List Dyn.

Example:
```
let x : List (List Num) = [[1,2], [3,4]] in
lists.flatten x : List Num
```
Record: {field1: T1, .., fieldn: Tn}. A record whose field names are known statically as field1, .., fieldn, respectively of type T1, .., Tn.

Example:
```
let pair : {fst: Num, snd: Str} = {fst = 1, snd = "a"} in
pair.fst : Num
```
Dynamic record: {_: T}. A record whose field names are statically unknown but are all of type T. Typically used to model dictionaries.

Example:
```
let occurrences : {_: Num} = {a = 1, b = 3, c = 0} in
records.map (fun char count => count + 1) occurrences : {_ : Num}
```
Enum: <tag1, .., tagn>: an enumeration comprised of alternatives tag1, .., tagn. An enumeration literal is prefixed with a backtick and serialized as a string. It is useful to encode finite alternatives. The advantage over strings is that the typechecker handles them more finely: it is able to detect incomplete matches, for example.

Example:
```
let protocol : <http, ftp, sftp> = `http in
(switch {
  `http => 1,
  `ftp => 2,
  `sftp => 3
} protocol) : Num
```
Arrow (function): S -> T. A function taking an argument of type S and returning a value of type T. For multi-parameter functions, just iterate the arrow constructor.

Example:
```
{
  incr : Num -> Num = fun x => x + 1,
  mkPath : Str -> Str -> Str -> Str = fun basepath filename ext =>
    "#{basepath}/#{filename}.#{ext}",
}
```

Polymorphism

Type polymorphism

Usually, a function like filter would be defined in a library. In this case, it is good practice to write a type annotation for it, if only to provide the consumers of this library with an explicit interface. What should be the type annotation for filter?

In our initial filter example, we are filtering on a list of numbers. But the code of filter is agnostic with respect to the type of elements of the list. That is, filter is generic. Genericity is expressed in Nickel through polymorphism. A polymorphic type is a type that contains the keyword forall, which introduces type variables that can later be substituted for any concrete type. Here is our polymorphic type annotation for filter:

{
  filter : forall a. (a -> Bool) -> List a -> List a = ...,
}

Now, filter can be used on numbers as in our initial example, but on strings as well:

{
  foo : List Str = filter (fun s => strings.length s > 2) ["a","ab","abcd"],
  bar : List Num = filter (fun x => x % 2 == 0) [1,2,3,4,5,6],
}

You can use as many parameters as you need:

let fst : forall a b. a -> b -> a = fun x y => x in
let snd : forall a b. a -> b -> b = fun x y => y in
{ n = fst 1 "a", s = snd 1 "a" } : {n: Num, s: Str}

Or even nest them:

let higherRankId : forall a. (forall b. b -> b) -> a -> a
  = fun id x => id x in
let id : forall a. a -> a
  = fun x => x in
higherRankId id 0 : Num

Type inference and polymorphism

If we go back to our first example of the statically typed filter without the polymorphic annotation and try to add a call to filter on a list of strings, the typechecker surprisingly rejects our code:

(let filter = ... in
let result = filter (fun x => x % 2 == 0) [1,2,3,4,5,6] in
let dummy = filter (fun s => strings.length s > 2) ["a","ab","abcd"] in
result) : List Num

Result:

error: Incompatible types
  ┌─ repl-input-35:2:37
  │
2 │ let dummy = filter (fun s => strings.length s > 2) ["a","ab","abcd"] in
  │                                             ^ this expression
  │
  = The type of the expression was expected to be `Str`
  = The type of the expression was inferred to be `Num`
  = These types are not compatible

The reason is that without an explicit polymorphic annotation, the typechecker will always infer non-polymorphic types. If you need polymorphism, you have to write a type annotation. Here, filter is inferred to be of type (Num -> Bool) -> List Num -> List Num, guessed from the application in the right-hand side of result.

Note: if you are a more type-inclined reader, you may wonder why the typechecker is not capable of inferring a polymorphic type for filter by itself. Indeed, Hindley-Milner type-inference can precisely infer heading foralls, such that the previous rejected example would be accepted. We chose to abandon this so-called automatic generalization, because doing so just makes things simpler with respect to the implementation, the design and the extensibility of the language and the type system. Requiring annotation of polymorphic functions seems like a good practice and a small price to pay in return, in a non type-heavy configuration language like Nickel.
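To make the fix concrete, here is a sketch of the rejected program with an explicit polymorphic annotation on filter. The body of filter is the one from the earlier examples (elided as ... in the rejected program), and the snippet has not been checked against a specific Nickel version. With the annotation, each call is expected to instantiate a independently, so both the call on numbers and the call on strings should be accepted.

```nickel
// The explicit `forall a.` annotation is the only change to `filter`.
(let filter : forall a. (a -> Bool) -> List a -> List a = fun pred l =>
     lists.foldl (fun acc x => if pred x then acc @ [x] else acc) [] l in
// Here `a` is instantiated to Num...
let result = filter (fun x => x % 2 == 0) [1,2,3,4,5,6] in
// ...and here to Str.
let dummy = filter (fun s => strings.length s > 2) ["a","ab","abcd"] in
result) : List Num
```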

Row polymorphism

In a configuration language, you will often find yourself handling records of various kinds. In a simple type system, you can hit the following issue:

(let addTotal: {total: Num} -> {total: Num} -> Num
  = fun r1 r2 => r1.total + r2.total in
let r1 = {jan = 200, feb = 300, march = 10, total = jan + feb} in
let r2 = {aug = 50, sept = 20, total = aug + sept} in
let r3 = {may = 1300, june = 400, total = may + june} in
{
  partial1 = addTotal r1 r2,
  partial2 = addTotal r2 r3,
}) : {partial1: Num, partial2: Num}

Result:

error: Type error: extra row `sept`
  ┌─ repl-input-40:8:23
  │
8 │   partial2 = addTotal r2 r3,
  │                       ^^ this expression
  │
  = The type of the expression was expected to be `{total: Num}`, which does not contain the field `sept`
  = The type of the expression was inferred to be `{total: Num, sept: Num, aug: Num}`, which contains the extra field `sept`

The problem here is that for this code to run fine, the requirement of addTotal should be that both arguments have a field total: Num, but could very well have other fields, for all we care. Unfortunately, we don't know right now how to express this constraint. The current annotation is too restrictive, because it imposes that arguments have exactly one field total: Num, and nothing more.

To express such constraints, Nickel features row polymorphism. The idea is similar to polymorphism, but instead of substituting a parameter for a single type, we can substitute a parameter for a whole sequence of field declarations, also referred to as rows:

(let addTotal: forall a b. {total: Num | a} -> {total: Num | b} -> Num
  = fun r1 r2 => r1.total + r2.total in
let r1 = {jan = 200, feb = 300, march = 10, total = jan + feb} in
let r2 = {aug = 50, sept = 20, total = aug + sept} in
let r3 = {may = 1300, june = 400, total = may + june} in
{
  partial1 = addTotal r1 r2,
  partial2 = addTotal r2 r3,
}) : {partial1: Num, partial2: Num}

Result:

{partial1 = 570, partial2 = 1770}

In the type of addTotal, the part {total: Num | a} expresses exactly what we wanted: the argument must have a field total: Num, but the tail (the rest of the record type) is polymorphic, and a may be substituted for arbitrary fields (such as jan: Num, feb: Num). We used two different generic parameters a and b, to express that the tails of the arguments may differ. If we used a in both places, as in forall a. {total: Num | a} -> {total: Num | a} -> Num, we could still write addTotal {total = 1, foo = 1} {total = 2, foo = 2} but not addTotal {total = 1, foo = 1} {total = 2, bar = 2}. Using distinct parameters a and b gives us maximum flexibility.
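The single-parameter variant can be sketched as follows. The name addTotalSame is hypothetical and only serves to distinguish it from addTotal; the snippet has not been checked against a specific Nickel version.

```nickel
(let addTotalSame : forall a. {total: Num | a} -> {total: Num | a} -> Num
  = fun r1 r2 => r1.total + r2.total in
// Both arguments carry the same extra field `foo`, so one tail `a` fits both.
addTotalSame {total = 1, foo = 1} {total = 2, foo = 2}) : Num
// Replacing the second argument with {total = 2, bar = 2} is expected to be
// rejected: `a` cannot stand for both `foo: Num` and `bar: Num`.
```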

What comes before the tail may include several fields, as in, e.g., forall a. {total: Num, subtotal: Num | a} -> Num.

Row types can appear in the result of the function as well. The following example returns a new version of the input where fields a and b have been summed, without modifying the rest:

let sum : forall r. {a : Num, b : Num | r} -> {a : Num, b : Num, sum : Num | r}
        = fun x => x $[ "sum" = x.a + x.b]
in sum {a = 1, b = 2, c = 3} // {a=1, b=2, sum=3, c=3}

Note that row polymorphism also works with enums, with the same intuition of a tail that can be substituted for something else. For example:

let portOf : forall a. <http, ftp | a> -> Num = fun protocol =>
  switch {
    `http => 80,
    `ftp => 21,
    _ => 8000,
  } protocol

Because the switch statement has a catch-all case _, this function is indeed able to handle other tags than http and ftp, as expressed by its polymorphic type.

Take-away

The type system of Nickel has the usual basic types (Dyn, Num, Str, and Bool) and type constructors for lists, records, enums and functions. Nickel features generics via polymorphism, introduced by the forall keyword. A type can be generic not only in other types: record and enum types can also be generic in their tail. The tail is delimited by |.

Interaction between statically typed and dynamically typed code

In the previous section, we've been focusing solely on the static typing side. We'll now explore how typed and untyped code interact.

Using statically typed code inside dynamically typed code

Until now, we have written the statically typed filter examples using statically typed blocks that enclosed both the definition of filter and the call sites. More realistically, filter would be a statically typed library function (it is actually part of the standard library as lists.filter) and likely be called from dynamically typed configuration files. In this situation, the call site escapes the typechecker. Thus, without an additional mechanism, static typing would only ensure that the implementation of filter doesn't violate the typing rules, but wouldn't prevent an ill-formed call from dynamically typed code. At first sight, static typing hasn't solved the original issue of delayed dynamic type errors at all! Remember, the typical problem is the caller passing a value of the wrong type that eventually raises an error from within filter.

Fortunately, Nickel does have a mechanism to prevent this from happening and to provide good error reporting in this situation. Let us see this for ourselves by calling the statically typed lists.filter from dynamically typed code:

lists.filter (fun x => if x % 2 == 0 then x else null) [1,2,3,4,5,6]

Result:

error: Blame error: contract broken by the caller.
  ┌─ :1:17
  │
1 │ forall a. (a -> Bool) -> List a -> List a
  │                 ---- expected return type of a function provided by the caller
  │
  ┌─ repl-input-45:1:67
  │
1 │ lists.filter (fun x => if x % 2 == 0 then x else null) [1,2,3,4,5,6]
  │                                                                   - evaluated to this expression
  │
  = This error may happen in the following situation:
    1. A function `f` is bound by a contract: e.g. `(Num -> Num) -> Num`.
    2. `f` takes another function `g` as an argument: e.g. `f = fun g => g 0`.
    3. `f` is called by with an argument `g` that does not respect the contract: e.g. `f (fun x => false)`.
  = Either change the contract accordingly, or call `f` with a function that returns a value of the right type.
  = Note: this is an illustrative example. The actual error may involve deeper nested functions calls.

note:
    ┌─ <stdlib/lists>:160:14
    │
160 │     filter : forall a. (a -> Bool) -> List a -> List a
    │              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ bound here

[...]
note:
  ┌─ repl-input-45:1:1
  │
1 │ lists.filter (fun x => if x % 2 == 0 then x else null) [1,2,3,4,5,6]
  │ -------------------------------------------------------------------- (3) calling <func>

We call filter from a dynamically typed location, but still get a spot-on error. Precisely to prevent dynamically typed code from injecting values of the wrong type into statically typed blocks via function calls, the interpreter protects such blocks with a contract. Contracts form a principled runtime verification scheme. Please refer to the dedicated manual section for more details, but for now, you can just remember that any type annotation (wherever it is) gives rise at runtime to a corresponding contract application. In other words, foo: T and foo | T (here | is contract application, not the row tail separator) behave exactly the same at runtime.
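As a small illustration of that last point, the sketch below defines the same function twice, once with a type annotation and once with a contract application. The names incrTyped and incrGuarded are hypothetical, and the snippet has not been checked against a specific Nickel version.

```nickel
// Statically typed: the body is typechecked, and calls are guarded at runtime.
let incrTyped : Num -> Num = fun x => x + 1 in
// Contract only: the body is not typechecked, but calls are guarded the same way.
let incrGuarded | Num -> Num = fun x => x + 1 in
{
  ok1 = incrTyped 1,
  ok2 = incrGuarded 1,
}
```

Replacing either argument with a string such as "a" is expected to raise a blame error pointing at the caller in both cases, since both functions are protected by the same Num -> Num contract at runtime.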

Thanks to this guard, you can statically type your library functions and use them from dynamically typed code while still enjoying good error messages.

Using dynamically typed code inside statically typed code

In the other direction, we face a different issue. Because dynamically typed code just gets assigned the Dyn type most of the time, we can't use a dynamically typed value inside a statically typed block directly:

let x = 0 + 1 in
(1 + x : Num)

Result:

error: Incompatible types
  ┌─ repl-input-6:1:6
  │
1 │ (1 + x : Num)
  │      ^ this expression
  │
  = The type of the expression was expected to be `Num`
  = The type of the expression was inferred to be `Dyn`
  = These types are not compatible

We could add a type annotation to x. But sometimes we don't want to, or we can't. Maybe in a real use-case, x is an expression that we know correctly evaluates to a number but is rejected by the typechecker because it uses dynamic idioms. In this case, we can trade a type annotation for a contract application:

Example:

let x | Num = if true then 0 else "a" in
(1 + x : Num)

Here, x is clearly always a number, but it is not well-typed (the then and else branches of an if must have the same type). Nonetheless, this program is accepted! Because we inserted a contract application, the typechecker can be sure that if x is not a number, the program will fail early with a detailed contract error. Thus, if we reach 1 + x, at this point x is necessarily a number and won't cause any type mismatch. In a way, the contract application acts like a type cast, but whose verification is delayed to run-time.
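To see the delayed verification at work, consider this variant of the example, where the condition selects the string branch. It is a sketch that has not been checked against a specific Nickel version. The typechecker is expected to accept it exactly as before, since it trusts the contract application, while evaluation is expected to stop with a contract error on x before the addition is attempted.

```nickel
// Same shape as the example above, but `x` now evaluates to a string.
// The `| Num` contract is what is expected to catch this at runtime.
let x | Num = if false then 0 else "a" in
(1 + x : Num)
```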

Dually to a static type annotation, a contract application also turns the typechecker off again. You are back in the dynamic world. This is illustrated by the following program being accepted, where we inlined x inside the statically typed block:

(1 + ((if true then 0 else "a") | Num)) : Num

While a fully statically typed version is rejected because of the type mismatch between branches:

(1 + (if true then 0 else "a")) : Num

Result:

error: Incompatible types
  ┌─ repl-input-46:1:27
  │
1 │ (1 + (if true then 0 else "a")) : Num
  │                           ^^^ this expression
  │
  = The type of the expression was expected to be `Num`
  = The type of the expression was inferred to be `Str`
  = These types are not compatible

Apparent type

As a side note, annotations are not always needed to use dynamically typed code inside a statically typed block. The following example is accepted:

let x = 1 in
(1 + x : Num)

The typechecker tries to respect the intent of the programmer. If one doesn't use annotations, then the code shouldn't be typechecked, whatever the reason is. If you want x to be statically typed, you should annotate it.

That being said, the typechecker still avoids being too rigid: in the previous example, it is obvious that 1 is of type Num. This information is cheap to gather. When encountering a binding outside of a typed block, the typechecker determines the apparent type of the definition. The rationale is that determining the apparent type shouldn't recurse arbitrarily inside the expression or do anything non-trivial. Typically, replacing 1 with the compound expression 0 + 1 changes the type of x to Dyn and makes the example fail. For now, the typechecker determines an apparent type that is not Dyn only for literals (numbers, strings, booleans), lists, variables, imports and annotated expressions. Otherwise, the typechecker falls back to Dyn. It may do more in the future (assign Dyn -> Dyn to functions, {_: Dyn} to records, etc.).
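The rules above can be summarized in one sketch. The bindings a, b and c are hypothetical names, and the snippet has not been checked against a specific Nickel version; it only restates the cases listed in this section.

```nickel
// Literal: the apparent type of `a` is Num.
let a = 1 in
// Compound expression: the apparent type of `b` falls back to Dyn, so using
// `b` directly inside the typed block below would be rejected.
let b = 0 + 1 in
// Annotated expression: the apparent type of `c` is the annotation, Num.
let c = (0 + 1 : Num) in
// `b` needs a contract application to be used as a Num; `a` and `c` do not.
(a + (b | Num) + c : Num)
```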

Take-away

When calling typed code from untyped code, Nickel automatically inserts contract checks at the boundary, so that errors are reported earlier and more clearly. In the other direction, an expression exp | Type is blindly accepted to be of type Type by the typechecker. This is a way of using untyped values inside typed code by telling the typechecker "trust me on this one, and if I'm wrong there will be a contract error anyway". While a type annotation switches the typechecker on, a contract annotation switches it back off.

Using contracts as types

Type annotations and contracts share the same syntax. This means that you can technically use custom contracts as any other type inside a static type annotation.

let Port = contracts.from_predicate (fun value =>
  builtins.is_num value
  && value % 1 == 0
  && value >= 0
  && value <= 65535) in

(10 : #Port)

But this program is unfortunately rejected by the typechecker:

Result:

error: Incompatible types
  ┌─ repl-input-0:7:2
  │
7 │ (10 : #Port)
  │  ^^ this expression
  │
  = The type of the expression was expected to be `#Port`
  = The type of the expression was inferred to be `Num`
  = These types are not compatible

It turns out that statically ensuring that an arbitrary expression will eventually respect an arbitrary user-written predicate is a really hard problem even in simple cases (technically, it is even undecidable in the general case). The typechecker doesn't have a clue about the relation between numbers and ports. So, what can it do with annotations like #Port? There is one situation when the typechecker can be sure that something will eventually be a port number, or will fail with the correct error message: when using a contract application.

(let p | #Port = 10 - 1 in
 let id = fun x => x in
 id p
) : #Port

A custom contract hence acts like an opaque type (sometimes called abstract type as well) for the typechecker. The typechecker doesn't really know much about it except that the only way to construct a value of type #Port is to use contract application. You also need an explicit contract application to cast back a #Port to a Num: (p | Num) + 1 : Num.
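Putting both directions together, the following sketch goes from a number to a #Port and back to a Num inside a single typed block. It reuses the Port contract defined earlier in this section and has not been checked against a specific Nickel version.

```nickel
let Port = contracts.from_predicate (fun value =>
  builtins.is_num value
  && value % 1 == 0
  && value >= 0
  && value <= 65535) in

// Num to #Port: only a contract application can produce a #Port.
(let p | #Port = 10 - 1 in
 // #Port back to Num: an explicit contract application is needed again.
 (p | Num) + 1) : Num
```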

Because of the rigidity of opaque types, using custom contracts inside static type annotations is not very useful right now. We just had to give them a reasonable meaning at typechecking time because types and contracts share the same specification syntax, and they can thus appear inside types.

Typing in practice

When should you use a type annotation, a contract application, or neither? This is what the guide Type versus contracts: when to? is for.
