Now, when we have two aliases like
```
T a : [ A, B (U a) ]
U a : [ C, D (T a) ]
```
during the first pass, we simply canonicalize them but add neither to
the scope. This means that `T` will not be instantiated in the
definition of `U`. Only in the second pass, during correction, do we
instantiate both aliases **independently**:
```
T a : [ A, B [ C, D (T a) ] ]
U a : [ C, D [ A, B (U a) ] ]
```
and now we can mark each recursive, individually:
```
T a : [ A, B [ C, D <rec1> ] ] as <rec1>
U a : [ C, D [ A, B <rec2> ] ] as <rec2>
```
This means that the surface types shown to users might be a bit larger,
but it has the benefit that everything needed to understand a layout of
a type in later passes is stored on the type directly, and we don't need
to keep alias mappings.
Since we sort by connected components, this should be complete.
Closes#2458
This work is related to restricting tag union sizes in input positions.
As an example, for something like
```
\x -> when x is
A M -> X
A N -> X
A _ -> X
```
we'd like to infer `[A [M, N]* ]` rather than the `[A, [M, N]* ]*` we
infer today. Notice the difference is that the former type tells us we
only accepts `A`s, but the argument of the `A` can be `M`, `N` or
anything else (hence the `_`).
So what's the idea? It's an encoding of the "must have"/"might have"
design discussed in https://github.com/rtfeldman/roc/issues/1758. Let's
take our example above and walk through unification of each branch.
Suppose `x` starts off as a flex var `t`.
```
\x -> when x is
A M -> X
```
Now we introduce a new kind of constraint called a "presence"
constraint. It says "t has at least [A [M]]". I'll notate this as `t +=
[A [M]]`. When `t` is free as it is here, this is equivalent to `t ~
[A [M]]`.
```
\x -> when x is
...
A N -> X
```
At this branch we introduce the presence constraint `[A [M]] += [A [N]]`.
Notice that there's two tag unions we care about resolving here - one is
the toplevel one that says "I have an `A ...` inside of me", and the
other one is the tag union that's the tyarg to `A`. They are distinct
and at different depths.
For the toplevel one, we first figure out if the number of tags in the
union needs to expand. It does not - we're hoping to resolve the type
`[A [M, N]]`, which only has `A` in the toplevel union. So, we don't
need to do anything extra there, other than the merge the nested tag
unions.
We recurse on the shared tags, and now we have the presence constraint
`[M] += [N]`. At this point it's important to remember that the left and
right hand types are backed by type variables, so this is really
something like `t11 [M] += t12 [N]`, where `[M]` and `[N]` are just what
we know the variables `t11` and `t12` to be at this moment. So how do we
solve for `t11 [M, N]` from here? Well, we can encode this constraint as
a type variable definition and a unification constraint we already know
how to solve:
```
New definition: t11 [M]a (a fresh)
New constraint: a ~ t12 [N]
```
That's it; upon unification, `t11 [M, N]` falls out.
Okay, last step.
```
\x -> when x is
...
A _ -> X
```
We now have `[A [M, N]] += [A a]`, where `a` is a fresh unbound
variable. Again nothing has to happen on the toplevel. We walk down and
find `t11 [M, N] += t21 a`. This is actually called an "open constraint"; we
differentiate it at the time we generate constraints because it follows
syntactically from the presence of an `_`, but it's semantically
equivalent to the presence constraint `t11 [M, N] += t21 a`. It's just
called opening because literally the only way `t11 [M, N] += t21 a` can
be true is if we set `t11 a`. Well, actually, we assume `a` is a tag
union, so we just make `t11` the open tag union `[M, N]a`. Since `a` is
unbound, this eventually becomes a wildcard and hence falls out `[M, N]*`.
Also, once we open a tag union with an open constraint, we never close
it again.
That's it. The rest falls out recursively. This gives us a really easy
way to encode these ordering constraints in the unification-based system
we have today with minimal additional intervention. We do have to patch
variables in-place sometimes, and the additive nature of these
constraints feels about out-of-place relative to unification, but it
seems to work well.
Resolves#1758
By emitting a runtime error rather than panicking when we can't layout
a record, we help programs like
```
main =
get = \{a} -> a
get {b: "hello world"}
```
execute as
```
Mismatch in compiler/unify/src/unify.rs Line 1071 Column 13
Trying to unify two flat types that are incompatible: EmptyRecord ~ { 'a' : Demanded(122), }<130>
🔨 Rebuilding host...
── TYPE MISMATCH ───────────────────────────────────────────────────────────────
The 1st argument to get is not what I expect:
8│ get {b: "hello world"}
^^^^^^^^^^^^^^^^^^
This argument is a record of type:
{ b : Str }
But get needs the 1st argument to be:
{ a : a }b
Tip: Seems like a record field typo. Maybe a should be b?
Tip: Can more type annotations be added? Type annotations always help
me give more specific messages, and I think they could help a lot in
this case
────────────────────────────────────────────────────────────────────────────────
'+fast-variable-shuffle' is not a recognized feature for this target (ignoring feature)
'+fast-variable-shuffle' is not a recognized feature for this target (ignoring feature)
Done!
Application crashed with message
Can't create record with improper layout
Shutting down
```
rather than the hanging
```
Mismatch in compiler/unify/src/unify.rs Line 1071 Column 13
Trying to unify two flat types that are incompatible: EmptyRecord ~ { 'a' : Demanded(122), }<130>
thread '<unnamed>' panicked at 'invalid layout from var: UnresolvedTypeVar(104)', compiler/mono/s
rc/layout.rs:1510:52
```
that was previously produced.
Part of #2227
This reverts commit 6e4fd5f06a1ae6138659b0073b4e2b375a499588.
This idea didn't work out because cloning the type and storing it on a
variable still resulted in the solver trying to uify the variable with
the type. When there were errors, which there certainly would be if we
tried to unify the variable with a structure that had nested flex/rigid
vars, the nested flex/rigid vars would inherit those errors, and the
program wouldn't typecheck.
Since the motivation here was to expose the signature type to
`reporting` so that we could modify it with suggestions, we should
instead pass that information along in something analogous to the
`Expected` struct.
To provide better error messages and suggestions related to changing
type annotations, we now pass annotation type signatures all the way
down through the constraint solver. At constraint generation we
associate the type signature with a unique variable, and during error
reporting, we pull out an `ErrorType` corresponding to the original type
signature, by looking up the unique variable. This gives us two nice
things:
1. It means we don't have to pass the original, AST-like type
annotation, which can be quite large, to everyone who looks at an
expectation.
2. It gives us a translation from a `Type` to an `ErrorType` for free
using the existing translation procedure in `roc_types::subs`,
without having to create a new translation function.