This changes the `decl_ctx` to be toplevel only, with flattened references to
uids for most elements. The module hierarchy, which is still useful in a few
places, is kept separately.
Module names are also changed to UIDs early on, and support for module aliases
has been added (needs testing).
This resolves some issues with lookup, and should be much more robust, as well
as more convenient for most lookups.
The `decl_ctx` was also extended for string ident lookups, which avoids having
to keep the desugared resolution structure available throughout the compilation
chain.
rather than scattered in structures
The context is still hierarchical for defs though, so one needs to retrieve the
path to look up info in the correct context. Exceptions are enum and struct
defs, which are re-exposed at toplevel.
(first working dynload test with compilation done by manual calls to ocaml)
A few pieces of the puzzle:
* Loading of interfaces only from Catala files
* Registration of toplevel values in modules compiled to OCaml, to allow access
using dynlink
* Shady conversion from OCaml runtime values to/from Catala expressions, to
allow interop (ffi) of compiled modules and the interpreter
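As a rough illustration of the registration step, here is a minimal sketch of a registry that a compiled module could fill in when it is loaded, and that an interpreter could query afterwards. The names `register_module` and `lookup_value` and the use of `Obj.t` are assumptions made for this sketch, not the actual runtime API.

```ocaml
(* Hypothetical registry: module name -> (value name, untyped value) list.
   Values are stored as Obj.t, so the caller is responsible for casting
   them back to the right type (this is the "shady" part). *)
let registry : (string, (string * Obj.t) list) Hashtbl.t = Hashtbl.create 17

(* Called from the toplevel of a compiled module, i.e. when it gets loaded *)
let register_module (modname : string) (values : (string * Obj.t) list) : unit =
  Hashtbl.replace registry modname values

(* Called by the interpreter to fetch a value exported by a loaded module *)
let lookup_value (modname : string) (vname : string) : Obj.t option =
  match Hashtbl.find_opt registry modname with
  | None -> None
  | Some values -> List.assoc_opt vname values

(* What a compiled module could run at load time *)
let () =
  register_module "Example_mod" [ "double", Obj.repr (fun (x : int) -> 2 * x) ]
```

With real dynamic loading, the last `let () = ...` would sit at the end of the compiled module, so that loading the module triggers the registration as a side effect.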
Two interdependent changes here:
1. Enforce all instances of `Shared_ast.gexpr` to use the generic type for marks.
This makes the interfaces a tad simpler to manipulate: you now write
`('a, 'm) gexpr` rather than `('a, 'm mark) gexpr`.
2. Define a polymorphic `Custom` mark case for use by pass-specific annotations,
and leverage this in the typing module.
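To make the two points more concrete, here is a simplified model of a generic mark type with a `Custom` case. All definitions are assumptions for illustration (the first type parameter of `gexpr` is dropped, and the field names are invented); the real ones are in `Shared_ast`.

```ocaml
(* Simplified, hypothetical model; not the actual Shared_ast definitions. *)
type pos = { file : string; line : int }
type typ = TInt | TBool

(* One record per kind of annotation *)
type untyped = { upos : pos }
type typed = { tpos : pos; ty : typ }
type 'a custom = { cpos : pos; custom : 'a }

(* The generic mark type: a GADT indexed by the annotation it carries *)
type 'm mark =
  | Untyped : untyped -> untyped mark
  | Typed : typed -> typed mark
  | Custom : 'a custom -> 'a custom mark

(* Expressions take 'm directly; the mark wrapper is applied inside,
   so users write [typed gexpr] rather than [typed mark gexpr] *)
type 'm gexpr = 'm naked_gexpr * 'm mark
and 'm naked_gexpr = Lit of int | Add of 'm gexpr * 'm gexpr

(* A function that works whatever the annotation is *)
let pos_of : type m. m mark -> pos = function
  | Untyped { upos } -> upos
  | Typed { tpos; _ } -> tpos
  | Custom { cpos; _ } -> cpos
```

In this model, a pass that needs its own annotation can work on `my_info custom gexpr` (with `my_info` a type of its choosing) without a new mark constructor being added for it.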
The phantom polymorphic variant qualifying AST nodes is reversed:
- previously, we were explicitly restricting each AST node to the passes where it belonged using a closed type (e.g. `[< dcalc | lcalc]`)
- now, each node instead declares the "feature" it provides using an open type (e.g. `[> 'Exceptions ]`)
- then the AST for a specific pass limits the features it allows with a closed type
The result is that you can mix and match all features if you wish,
even if the result is not a valid AST for any given pass. More
interestingly, it's now easier to write a function that works on
different ASTs at once (it's the inferred default if you don't write a
type restriction).
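The encoding can be sketched on a toy AST as follows. The constructor names, feature tags and pass types below are invented for the example and are much smaller than the real ones in `shared_ast/definitions.ml`.

```ocaml
(* Toy AST: feature-specific nodes declare their feature with an open type *)
type 'a expr =
  | Lit : int -> 'a expr (* no feature needed *)
  | Add : 'a expr * 'a expr -> 'a expr
  | Default : 'a expr list * 'a expr -> ([> `DefaultTerms ] as 'a) expr
  | Raise : string -> ([> `Exceptions ] as 'a) expr

(* Each pass limits the features it allows with a closed type *)
type dcalc = [ `DefaultTerms ]
type lcalc = [ `Exceptions ]

let d : dcalc expr = Default ([ Lit 1 ], Add (Lit 2, Lit 3))
let l : lcalc expr = Add (Lit 1, Add (Lit 2, Raise "empty"))

(* Mixing features is accepted, even though no pass uses this AST *)
let mixed = Add (Default ([], Lit 0), Raise "empty")

(* Without a type restriction, this is inferred as ['a expr -> int]
   and therefore applies to the ASTs of both passes *)
let rec count_adds = function
  | Add (a, b) -> 1 + count_adds a + count_adds b
  | _ -> 0
```

In this sketch, writing `(Raise "empty" : dcalc expr)` is rejected by the type checker, because the closed type `dcalc` does not include the `Exceptions` feature.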
The opportunity was also taken to simplify the encoding of the
operators, which don't need a second type parameter anymore.
Many changes got bundled in here and would be too tedious to separate.
Closes #330
See changes in `shared_ast/definitions.ml` to check the main point.
- the biggest change is a modification of the struct and enum types in
expressions: they are now stored as `Map`s throughout passes, and no longer
converted to indexed lists after scopelang. Their accessors are also changed,
and tuples only exist in Lcalc (they're used for closure conversion).
This implied adding some more information in the contexts, to keep the mapping
between struct fields and scope output variables. It should also be much more
robust (no longer relying on assumptions upon different orderings).
- another very pervasive change is more cosmetic: the rewrite of the main AST to
use inline records, labelling individual subfields.
- moved the checks for correct definitions and accesses of structures from
`Scope_to_dcalc` to `Typing`
- defining some new shallow iterators in module `Shared_ast.Expr`, and
factorising a few same-pass rewriting functions accordingly (closure
conversion, optimisations, etc.)
- some smaller style improvements (ensuring we use the proper compare/equal
functions instead of `=` in a few `when` clauses, for example)
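A minimal sketch of the first two points, with struct fields stored in a map and AST nodes using inline records. Field identifiers are plain strings here, which is an assumption for brevity (the compiler uses dedicated uids), and the constructor names are only illustrative.

```ocaml
module FieldMap = Map.Make (String)

type expr =
  | ELit of int
  | EStruct of { fields : expr FieldMap.t }
  | EStructAccess of { e : expr; field : string }

exception Bad_access of string

(* Access goes through the field name, so nothing depends on the order
   in which fields were declared or defined *)
let rec eval (expr : expr) : expr =
  match expr with
  | ELit _ -> expr
  | EStruct { fields } -> EStruct { fields = FieldMap.map eval fields }
  | EStructAccess { e; field } -> (
    match eval e with
    | EStruct { fields } -> (
      match FieldMap.find_opt field fields with
      | Some v -> v
      | None -> raise (Bad_access field))
    | ELit _ | EStructAccess _ -> raise (Bad_access field))

(* The same struct, built with its fields in two different orders *)
let s1 =
  EStruct { fields = FieldMap.(empty |> add "x" (ELit 1) |> add "y" (ELit 2)) }

let s2 =
  EStruct { fields = FieldMap.(empty |> add "y" (ELit 2) |> add "x" (ELit 1)) }
```

With indexed lists, `s1` and `s2` would only behave the same if every pass agreed on one canonical ordering of the fields; with maps that assumption is no longer needed.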
This was the only reasonable solution I found to the issue raised
[here](https://github.com/CatalaLang/catala/pull/334#discussion_r987175884).
This was a pretty tedious rewrite, but it should now ensure we are doing things
correctly. As a bonus, the "smart" expression constructors are now used
everywhere to build expressions (so another refactoring like this one should be
much easier) and this makes the code overall feel more
straightforward (`Bindlib.box_apply` or `let+` no longer need to be visible!)
---
Basically, we were using values of type `gexpr box = naked_gexpr marked box`
throughout when (re-)building expressions. This was done 99% of the time by
using `Bindlib.box_apply add_mark naked_e` right after building `naked_e`. In
lots of places, we needed to recover the annotation of this expression later on,
typically to build its parent term (to inherit the position, or build the type).
Since it wasn't always possible to wrap these uses within `box_apply` (esp. as
bindlib boxes aren't a monad), here and there we had to call `Bindlib.unbox`,
just to recover the position or type. This had the very unpleasant effect of
forcing the resolution of the whole box (including applying any stored closures)
to reach the top-level annotation, which isn't even dependent on specific
variable bindings, and then, generally, throwing away the result.
Therefore, the change proposed here transforms
- `naked_gexpr marked Bindlib.box` into
- `naked_gexpr Bindlib.box marked` (aliased to `boxed_gexpr` or `gexpr boxed` for
convenience)
This means only
1. not fitting the mark into the box right away when building, and
2. accessing the top-level mark directly without unboxing
The functions for building terms from module `Shared_ast.Expr` could be changed
easily. But then they needed to be consistently used throughout, without
manually building terms through `Bindlib.apply_box` -- which covers most of the
changes in this patch.
`Expr.Box.inj` is provided to swap back to a box, before binding for example.
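The difference can be illustrated with a small stand-in for boxes. Here `box` is modelled as a plain thunk, which is an assumption made to keep the sketch self-contained (it is not the Bindlib implementation), and a counter records how often a box gets forced.

```ocaml
(* Stand-in for Bindlib boxes: a suspended computation *)
type 'a box = unit -> 'a

let box_apply (f : 'a -> 'b) (b : 'a box) : 'b box = fun () -> f (b ())
let unbox (b : 'a box) : 'a = b ()

type pos = int
type 'a marked = 'a * pos
type naked = Lit of int | Neg of naked marked

(* Counts how many times a literal box is actually resolved *)
let forced = ref 0

(* Before: the mark is inside the box, reading it requires unboxing *)
type old_boxed = naked marked box

let old_lit n p : old_boxed = fun () -> incr forced; (Lit n, p)
let old_pos (e : old_boxed) : pos = snd (unbox e)

(* After: the mark stays outside the box and is read directly *)
type new_boxed = naked box marked

let new_lit n p : new_boxed = ((fun () -> incr forced; Lit n), p)
let new_pos ((_, p) : new_boxed) : pos = p

(* Counterpart of the `inj` operation: move the mark back into the box *)
let inj ((b, p) : new_boxed) : naked marked box = box_apply (fun e -> (e, p)) b

(* Building a parent term that inherits the position of its child *)
let new_neg (e : new_boxed) : new_boxed =
  (box_apply (fun x -> Neg x) (inj e), new_pos e)
```

In this model, `new_neg (new_lit 1 42)` reads the position of its argument without forcing anything, whereas `old_pos` has to resolve the whole box just to read the same integer.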
Additionally, this gives a 40% speedup on `make -C examples pass_all_tests`,
which hints at the amount of unnecessary work we were doing --'
Handling code should now be reasonably well sorted between `Shared_ast.{Var,Expr,Scope,Program}`.
The function parameters (e.g. `make_let_in`) could be removed from the
scope handling functions since now the types are compatible, which
makes them much easier to read.