Turns out that we can't always assume that a successful unification of
type alias type variables means that those aliases had the same real
type from the start. Because type variables may contain unbound type
variables and grow during their unification (for example,
`[InvalidNumStr]a ~ [ListWasEmpty]b` unify to give `[InvalidNumStr,
ListWasEmpty]`), the real type may grow as well.
For this reason, continue to explicitly unify alias real types for now.
We can get away with not having to do so when the type variable
unification causes no changes to the unification tree at all, but we
don't have a great way to detect that right now (maybe snapshots?)
Closes#2583
This ensures that we use the correct specialized variable at the call
site of a function. In #2725 what happened was that a generalized
function was aliased, causing it to undergo generalization again. Then,
we lost the variable used to specialize at the call site. Instead, just
link to the partial proc being aliased directly.
There is an added benefit here, which is that we can avoid the
possibly-quadratic replacement of symbols in the generated statement.
Closes#2725
Closes#2535
See the referenced issue for longer discussion - here's the synopsis.
Consider this program
```
app "test" provides [ nums ] to "./platform"
alpha = { a: 1, b: 2 }
nums : List U8
nums =
[
alpha.a,
alpha.b,
]
```
Here's its IR:
```
procedure : `#UserApp.alpha` {I64, U8}
procedure = `#UserApp.alpha` ():
let `#UserApp.5` : Builtin(Int(I64)) = 1i64;
let `#UserApp.6` : Builtin(Int(U8)) = 2i64;
let `#UserApp.4` : Struct([Builtin(Int(I64)), Builtin(Int(U8))]) = Struct {`#UserApp.5`, `#UserApp.6`};
ret `#UserApp.4`;
procedure : `#UserApp.nums` List U8
procedure = `#UserApp.nums` ():
let `#UserApp.7` : Struct([Builtin(Int(I64)), Builtin(Int(U8))]) = CallByName `#UserApp.alpha`;
let `#UserApp.1` : Builtin(Int(U8)) = StructAtIndex 1 `#UserApp.7`;
let `#UserApp.3` : Struct([Builtin(Int(I64)), Builtin(Int(U8))]) = CallByName `#UserApp.alpha`;
let `#UserApp.2` : Builtin(Int(U8)) = StructAtIndex 1 `#UserApp.3`;
let `#UserApp.0` : Builtin(List(Builtin(Int(U8)))) = Array [`#UserApp.1`, `#UserApp.2`];
ret `#UserApp.0`;
```
What's happening is that we need to specialize `alpha` twice - once for the
type of a narrowed to a U8, another time for the type of b narrowed to a U8.
We do the specialization for alpha.b first - record fields are sorted by
layout, so we generate a record of type {i64, u8}. But then we go to
specialize alpha.a, but this has the same layout - {i64, u8} - so we reuse
the existing one! So (at least for records), we need to include record field
order associated with the sorted layout fields, so that we don't reuse
monomorphizations like this incorrectly!
This is a bit.. ugly, or at least seems suboptimal, but I can't think of
a better way to do it currently aside from demanding a uniform
representation, which we probably don't want to do.
Another option is something like the defunctionalization we perform
today, except also capturing potential uses of nested functions in the
closure tag of an encompassing lambda. So for example,
```
f = \x -> \y -> 1
```
would now record a lambdaset with the data `[Test.f
[TypeOfInnerClos1]]`, where `TypeOfInnerClos1` is e.g.
`[Test.f.innerClos1 I8, Test.f.innerClos1 I16]`, symbolizing that the
inner closure may be specialized to take an I8 or I16. Then at the time
that we create the capture set for `f`, we create a tag noting what
specialization should be used for the inner closure, and apply the
current defunctionalization algorithm. So effectively, the type of the
inner closure becomes a capture.
I'm not sure if this is any better, or if it has more problems.
@folkertdev any thoughts?
Closes#2322
This commit fixes a long-standing bug wherein bindings to polymorphic,
non-function expressions would be lowered at binding site, rather than
being specialized at the call site.
Concretely, consider the program
```
main =
n = 1
idU8 : U8 -> U8
idU8 = \m -> m
idU8 n
```
Prior to this commit, we would lower `n = 1` as part of the IR, and the
`n` at the call site `idU8 n` would reference the lowered definition.
However, at the definition site, `1` has the polymorphic type `Num *` -
it is not until the the call site that we are able to refine the type
bound by `n`, but at that point it's too late. Since the default layout
for `Num *` is a signed 64-bit int, we would generate IR like
```
procedure main():
let App.n : Builtin(Int(I64)) = 1i64;
...
let App.5 : Builtin(Int(U8)) = CallByName Add.idU8 App.n;
ret App.5;
```
But we know `idU8` expects a `u8`; giving it an `i64` is nonsense.
Indeed this would trigger LLVM miscompilations later on.
To remedy this, we now keep a sidecar table that maps symbols to the
polymorphic expression they reference, when they do so. We then
specialize references to symbols on the fly at usage sites, similar to
how we specialize function usages.
Looking at our example, the definition `n = 1` is now never lowered to
the IR directly. We only generate code for `1` at each place `n` is
referenced. As a larger example, you can imagine that
```
main =
n = 1
asU8 : U8 -> U8
asU32 : U32 -> U8
asU8 n + asU32 n
```
is lowered to the moral equivalent of
```
main =
asU8 : U8 -> U8
asU32 : U32 -> U8
asU8 1 + asU32 1
```
Moreover, transient usages of polymorphic expressions are lowered
successfully with this approach. See for example the
`monomorphized_tag_with_polymorphic_arg_and_monomorphic_arg` test in
this commit, which checks that
```
main =
mono : U8
mono = 15
poly = A
wrap = Wrapped poly mono
useWrap1 : [Wrapped [A] U8, Other] -> U8
useWrap1 =
\w -> when w is
Wrapped A n -> n
Other -> 0
useWrap2 : [Wrapped [A, B] U8] -> U8
useWrap2 =
\w -> when w is
Wrapped A n -> n
Wrapped B _ -> 0
useWrap1 wrap * useWrap2 wrap
```
has proper code generated for it, in the presence of the polymorphic
`wrap` which references the polymorphic `poly`.
https://github.com/rtfeldman/roc/pull/2347 had a different approach to
this - polymorphic expressions would be converted to (possibly capturing) thunks.
This has the benefit of reducing code size if there are many polymorphic
usages, but may make the generated code slower and makes integration
with the existing IR implementation harder. In practice I think the
average number of polymorphic usages of an expression will be very
small.
Closes https://github.com/rtfeldman/roc/issues/2336
Closes https://github.com/rtfeldman/roc/issues/2254
Closes https://github.com/rtfeldman/roc/issues/2344
Previously we would expand optional record fields to assignments when
converting record patterns to "when" expressions. This resulted in
incorrect code being generated.
Figuring out what this module was doing, and why, took me a bit less
than half an hour. We should document what's happening for others in the
future so they don't need to follow up on Zulip necessarily.