Best practices for developing and refactoring data types in Unison
Rúnar edited this page 2019-11-11 16:51:48 -05:00

The fact that Unison code is not text changes some things about how we work with data types.

Let's say we have a data type that supports the business, and we want to change it. Say it starts out as:

type ShoppingCart = Cart [Item]

And we want to evolve its representation to:

type ShoppingCart = Cart (Map Item Nat)

Before Unison

In a traditional language with algebraic data types and pattern matching, such as Haskell or Scala, we would have to

  1. Mutate the text file that defines the ShoppingCart data type.
  2. Manually change all the places where the Cart constructor is called (also by text file mutation).
  3. Manually change all functions that pattern match on ShoppingCart.
  4. Recompile and publish a new version of our data type.

Note that when users of our type update to the latest version, any code of theirs that uses Cart as either a constructor or pattern will be broken. Their code will no longer compile or work. They need to manually go through such code and change it.

For example, this operation (Unison notation)…

use ShoppingCart

addItem : Item -> ShoppingCart -> ShoppingCart
addItem i c = case c of
  Cart items -> Cart (i +: items)

needs to change to something like…

use ShoppingCart

addItem : Item -> ShoppingCart -> ShoppingCart
addItem i c = case c of
  Cart bag -> 
    previousCount = 0 `orDefault` Map.lookup i bag
    Cart (Map.insert i (previousCount + 1) bag)

Names are magic

Notably, any function that calls addItem, but doesn't itself use the Cart pattern or call the Cart constructor, gets updated for free, simply by virtue of the fact that the new ShoppingCart type has the same name as the old one.
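
To make this concrete, here is a minimal sketch of such a function, written in the same notation as the examples above. The name addTwice is hypothetical and is not part of the original code.

```unison
-- Hypothetical helper: it only goes through addItem and never
-- mentions the Cart constructor or the Cart pattern.
addTwice : Item -> ShoppingCart -> ShoppingCart
addTwice i c = addItem i (addItem i c)
```

Because addTwice never mentions Cart, its definition reads the same before and after the representation of ShoppingCart changes.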

The same thing happens (in Haskell and Scala) with constructors whose names and types don't change. Consider a type like:

type Element = Earth | Air | Fire | Water

And say we need to change it by e.g. adding a new constructor:

type Element = Earth | Air | Fire | Water | Leeloo

A lot of functions that use the constructors of the old type will be magically upgraded to the new one:

  1. Functions that construct Element by calling one of its constructors.
  2. Functions that pattern match on Element and have a catch-all pattern.

We have to manually change any pattern-matching functions on Element that don't have a catch-all pattern.
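
As a rough illustration of the two cases, consider the sketch below, again in Unison notation. The function names isFire and describe are hypothetical, and the `use Element` line assumes constructors are brought into scope the same way as with `use ShoppingCart` above.

```unison
use Element

-- Has a catch-all pattern, so it needs no edits when
-- Leeloo is added to Element.
isFire : Element -> Boolean
isFire e = case e of
  Fire -> true
  _ -> false

-- Lists every constructor explicitly, so someone has to revisit
-- it by hand and decide what Leeloo should map to.
describe : Element -> Text
describe e = case e of
  Earth -> "earth"
  Air -> "air"
  Fire -> "fire"
  Water -> "water"
```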

Migrating data types in Unison

Back to our original example, in Unison this time. We want to go from...

type ShoppingCart = Cart [Item]

to…

type ShoppingCart = Cart (Map Item Nat)

To make this change in Unison, we have to:

  1. Write the new ShoppingCart type (possibly by issuing an edit of the old one from the Codebase Manager).
  2. Issue an update in the Codebase Manager.
  3. Manually work through our edit frontier to change any functions that call the Cart constructor or use the Cart pattern.
  4. Publish our updated code as a patch.

In step 2, the Codebase Manager walks through the codebase and transitively updates any dependents of the old ShoppingCart to use the new type. It can do this as long as:

  1. The usage site doesn't use the Cart constructor or pattern directly.
  2. The updated function still typechecks.

In step 4, we end up with a patch which performs this same kind of update on any user code that it gets applied to.

Note here that users of our type who update to the latest version will have to manually go through any code that uses Cart as either a constructor or pattern, just like in Haskell or Scala, if they want that code to use the latest version of ShoppingCart.

But no code will be broken. Everyone's old code still works, as it's still using the old type, which can happily coexist with the new one since it has a different hash. Some users may simply opt to write conversions between the two types instead of doing the migration.
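
Such a conversion might look like the sketch below. It rests on assumptions that are not in the original text: the old type is taken to be reachable under the hypothetical name OldShoppingCart with constructor OldCart, foldr is assumed to be a right fold over lists, and addItem is the updated version shown earlier that works on the Map-based cart.

```unison
use ShoppingCart
use OldShoppingCart

-- Assumption: OldShoppingCart / OldCart are hypothetical names for
-- the old list-based type, which still exists under its own hash.
upgradeCart : OldShoppingCart -> ShoppingCart
upgradeCart old = case old of
  -- Re-add every item to an empty Map-based cart, so duplicates
  -- in the old list become counts in the new representation.
  OldCart items -> foldr addItem (Cart Map.empty) items
```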

When only some constructors change

It's common to have a large data type (with maybe dozens of data constructors), where we want to change the argument types for one of the constructors.

A simplified example:

type Shape = Rectangle Float Float
           | Circle Float

We might want to change that to:

type Shape = Rectangle Float Float
           | Ellipse Float Float

In Haskell or Scala, the workflow usually goes like this:

  1. Mutate the file that defines the data type, changing the Circle constructor to an Ellipse constructor.
  2. Rebuild the project and see what breaks.
  3. Go through every location where the code is broken and fix it.
  4. Publish the new code and let downstream users repeat steps 2 and 3 on their codebases.

Note that somewhere between steps 1 and 2 above, a new data type Shape comes into existence. Some of the old code still compiles, incidentally, because the data constructors have the same names, even though those names now refer to constructors of an entirely new type (which happens to have the same name as the old one). Thus code that only uses Rectangle will be fine, but code that uses Circle will be broken.

In Unison, we can't really have this workflow, because a Unison codebase can't be in a broken state. Without a general metaprogramming facility (which Unison will have some day), our only option is to update every place that uses the constructors of the old Shape type to use the new Shape type instead.

Proposals to remedy this follow.

Proposal: A codebase API for Unison

See "A Unison API for managing a Unison codebase" (Issue #922, unisonweb/unison on GitHub).

Proposal: Enlist the help of a text editor

See "Upgrading a large data type" (unisonweb/unison wiki on GitHub).

Proposal: Unison data types are true unions

Best practice: encourage your users to use smart constructors

The tedium of manually going through functions that call constructors on obsolete types can be alleviated somewhat by using smart constructors. A smart constructor is just a function that calls through to the actual constructor:

use ShoppingCart

cart : [Item] -> ShoppingCart
cart = Cart

You can migrate your smart constructors to the new type:

use ShoppingCart

cart : [Item] -> ShoppingCart
cart = foldr addItem (Cart Map.empty)

Any users of the smart constructor will have their code updated to call the new one when they apply our patch. Since the new smart constructor has the same type as the old one, every usage site still typechecks, so Unison can perform this update automatically.
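
For instance, user code along the lines of the sketch below goes only through cart, so it is the kind of usage site that can be migrated automatically. The name singletonCart is hypothetical.

```unison
-- Hypothetical user code: builds a cart through the smart
-- constructor and never calls Cart directly.
singletonCart : Item -> ShoppingCart
singletonCart i = cart [i]
```

The type of cart is [Item] -> ShoppingCart in both versions, so this definition typechecks against either one.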

If the data type changes in such a way that a smart constructor's type has to change, Unison won't be able to migrate calls to that particular smart constructor.

Best practice: use folds instead of patterns

The same idea holds for manually updating patterns. It's a good idea to provide pattern-matching over a type, or a given recursion scheme over it, in a single place.

foldCart : ([Item] -> a) -> ShoppingCart -> a
foldCart f c = case c of 
  Cart items -> f items

It may help with migration if users can replace the old foldCart with a new version that uses the new type:

foldCart : ([Item] -> a) -> ShoppingCart -> a
foldCart f c = case c of
  Cart bag ->
    go entry = case entry of
      (item, count) -> replicate count item
    f (flatMap go (Map.toList bag))
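
A caller that is written purely against foldCart could then look like the following sketch. The name cartSize is hypothetical, and List.size is assumed to be the list-length function from the Unison base library.

```unison
-- Hypothetical client code: it consumes the cart through foldCart
-- and never uses the Cart pattern itself.
cartSize : ShoppingCart -> Nat
cartSize = foldCart List.size
```

Since both versions of foldCart have the same type, cartSize does not need to change when ShoppingCart moves from a list to a Map. With the Map-based version, each item is expanded according to its count before being counted.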