Commit Graph

69 Commits

Author SHA1 Message Date
vbot
2b8d6676d5
Scopelang: add list of definitions's positions to scope var def 2024-07-17 18:03:23 +02:00
vbot
b2449f7b4c
Add multiple typing errors using delayed errors 2024-06-19 14:52:31 +02:00
Louis Gesbert
f04e889173 Pass the "external module" info along passes 2024-05-28 11:43:50 +02:00
Louis Gesbert
403156b36e Computation and checking of module hashes
This includes a few separate changes:

- pass visibility information of declarations (depending on wether the
  declaration was in a ```catala-metadata block or not)

- add reasonable hash computation functions to discriminate the interfaces. In
  particular:
  * Uids have a `hash` function that depends on their string, but not on their
    actual uid (which is not stable between runs of the compiler) ; the existing
    `hash` function and its uses have been renamed to `id`.
  * The `Hash` module provides the tools to properly combine hashes, etc. While
    we rely on `Hashtbl.hash` for the atoms, we take care not to use it on any
    recursive structure (it relies on a bounded traversal).

- insert the hashes in the artefacts, and properly check and report those (for
  OCaml)

**Remains to do**:

- Record and check the hashes in the other backends

- Provide a way to get stable inline-test outputs in the presence of module
  hashes

- Provide a way to write external modules that don't break at every Catala
  update.
2024-05-28 11:43:50 +02:00
Louis Gesbert
75bf768264 Reformat 2024-04-04 10:56:56 +02:00
Louis Gesbert
f151a9a794 Wip: handling subscope calls as normal definitions
- for the moment we still have specific vars for subscope inputs (in scopelang)
- this causes trouble with default-embedding of scope calls
2024-04-04 10:24:18 +02:00
Louis Gesbert
4eeb8221f4 Fix var bindings in desugared->scopelang 2024-04-04 10:24:18 +02:00
Louis Gesbert
7951661981 Turn subscope-vars into scope vars
They are to become citizens of the same class if we want to allow
output-subscopes (without unnecessary complications like deconstructing and
reconstructing the same structure). And it's reasonable to assume that they
share the same namespace.

With this we should shortly collapse the (internal) ambiguity between

- `subscope.subvar`: access to a variable within a subscope
- `subscope.subfield`: access to a field of the output structure contained in a
  subscope variable

With the subscope a variable, these should now become strictly equivalent, so
the plan is that the first could be removed.
2024-04-04 10:24:18 +02:00
Louis Gesbert
4cec981f62 Move global options of Cli to their own module
This resolves a dependency cycle that would forbid `Cli` from using the modue
`File`, which was annoying.
2024-03-19 15:18:35 +01:00
Louis Gesbert
a56d95d790 Typing: add a "assume operator types" mode
This allows for retyping after monomorphisation: a new function just extracts
the return type of the operator, without checking the operand types.

Also to avoid multiplying function arguments around the typer, the flags have
been gathered in a record that is included in the typing environment; it's ok to
give them default values as long as these are the strictest.
2024-02-07 17:41:04 +01:00
Denis Merigoux
5310e47e5b
Fix monomorphization problems with [TAny] left 2024-01-17 16:03:20 +01:00
Louis Gesbert
1ae955b504 Reformat 2023-11-30 23:53:38 +01:00
Louis Gesbert
3649f92975 Rework resolution of module elements
This changes the `decl_ctx` to be toplevel only, with flattened references to
uids for most elements. The module hierarchy, which is still useful in a few
places, is kept separately.

Module names are also changed to UIDs early on, and support for module aliases
has been added (needs testing).

This resolves some issues with lookup, and should be much more robust, as well
as more convenient for most lookups.

The `decl_ctx` was also extended for string ident lookups, which avoids having
to keep the desugared resolution structure available throughout the compilation
chain.
2023-11-30 21:14:12 +01:00
Louis Gesbert
c4715ea86e Reformat 2023-11-27 11:09:08 +01:00
Louis Gesbert
958aaebac3 Typing defaults fixes: keep in and out type in scope sigs 2023-11-27 11:06:16 +01:00
Louis Gesbert
f162f6e9bd Improve handling of module name definitions
and add some sanity-checks for consistency of used modules w.r.t. actually
loaded modules.
2023-09-27 13:14:03 +02:00
Denis Merigoux
9cecf5587a
Register surface syntax languge in program troughout the compilation chain 2023-09-22 18:05:26 +02:00
Louis Gesbert
544e18e110 Fixes related to environments and lookups 2023-08-31 18:31:48 +02:00
Louis Gesbert
7db63e5f78 Simplification: store paths in Uids
rather than scattered in structures

The context is still hierarchical for defs though, so one needs to retrieve the
path to lookup in the correct context for info. Exceptions are enums and struct
defs, which are re-exposed at toplevel.
2023-08-31 18:31:48 +02:00
Louis Gesbert
72882f82df Reformat 2023-08-31 17:55:36 +02:00
Louis Gesbert
9bac045d03 Implement module lookups for scopes, structs, and enums 2023-08-31 17:54:39 +02:00
Louis Gesbert
209be6b758 Improve integration of marks into the main AST
Two interdependent changes here:
1. Enforce all instances of Shared_ast.gexpr to use the generic type for marks.
   This makes the interfaces a tad simpler to manipulate: you now write
   `('a, 'm) gexpr` rather than `('a, 'm mark) gexpr`.
2. Define a polymorphic `Custom` mark case for use by pass-specific annotations.
   And leverage this in the typing module
2023-05-17 17:37:00 +02:00
Louis Gesbert
fc531777c0 Rework and normalise the Marked interface
The module is renamed to `Mark`, and functions renamed to avoid redundancy:

`Marked.mark` is now `Mark.add`
`Marked.unmark` is now `Mark.remove`
`Marked.map_under_mark` is now simply `Mark.map`
etc.

`Marked.same_mark_as` is replaced by `Mark.copy`, but with the arguments
swapped (which seemed more convenient throughout)

Since a type `Mark.t` would indicate a mark, and to avoid confusion, the type
`Marked.t` is renamed to `Mark.ed` as a shorthand for `Mark.marked` ; this part
can easily be removed if that's too much quirkiness.
2023-05-17 17:37:00 +02:00
Denis Merigoux
aa8ab3be3d
Merge branch 'master' into c_backend 2023-03-21 12:14:10 +01:00
Raphaël Monat
d5cd5b206a Show conflicting date rounding mode declarations when they happen 2023-03-16 18:51:01 +01:00
Raphaël Monat
7021c41f93 Add date rounding option within scopes 2023-03-16 16:55:55 +01:00
Louis Gesbert
c3af0b4097 Toplevel definitions: branch cleanup
- fix remaining warnings (mostly unused arguments)
- renamings throughout for consistency and clarity
2023-02-13 18:02:09 +01:00
Louis Gesbert
9b0c7583ec Add top-level definitions
Only handled until before scalc at the moment.
2023-02-13 11:43:49 +01:00
Denis Merigoux
d7df6b3e80
Typing now takes into account [TAny] in structs/enums 2023-02-08 16:00:53 +01:00
Denis Merigoux
c78a004b53
Leave everything unresolved for now 2023-02-08 16:00:53 +01:00
Louis Gesbert
660e5775de Rename utils to catala_utils 2022-11-28 16:38:09 +01:00
Louis Gesbert
b329afbbdb Rename all Map/Set calls accordingly
This is just a bunch of `sed` calls:
```shell
sed -i 's/ScopeSet/ScopeName.Set/g' compiler/**/*.ml*
sed -i 's/ScopeMap/ScopeName.Map/g' compiler/**/*.ml*
sed -i 's/StructMap/StructName.Map/g' compiler/**/*.ml*
sed -i 's/StructSet/StructName.Set/g' compiler/**/*.ml*
sed -i 's/EnumMap/EnumName.Map/g' compiler/**/*.ml*
sed -i 's/EnumSet/EnumName.Set/g' compiler/**/*.ml*
sed -i 's/StructFieldName/StructField/g' compiler/**/*.ml*
sed -i 's/StructFieldMap/StructField.Map/g' compiler/**/*.ml*
sed -i 's/StructFieldSet/StructField.Set/g' compiler/**/*.ml*
sed -i 's/EnumConstructorMap/EnumConstructor.Map/g' compiler/**/*.ml*
sed -i 's/EnumConstructorSet/EnumConstructor.Set/g' compiler/**/*.ml*
sed -i 's/RuleMap/RuleName.Map/g' compiler/**/*.ml*
sed -i 's/RuleSet/RuleName.Set/g' compiler/**/*.ml*
sed -i 's/LabelMap/LabelName.Map/g' compiler/**/*.ml*
sed -i 's/LabelSet/LabelName.Set/g' compiler/**/*.ml*
sed -i 's/ScopeVarMap/ScopeVar.Map/g' compiler/**/*.ml*
sed -i 's/ScopeVarSet/ScopeVar.Set/g' compiler/**/*.ml*
sed -i 's/SubScopeNameMap/SubScopeName.Map/g' compiler/**/*.ml*
sed -i 's/SubScopeNameSet/SubScopeName.Set/g' compiler/**/*.ml*
```

... and reformat
2022-11-28 16:38:09 +01:00
Louis Gesbert
47799ea24f Uniform naming of conversion modules across compilation passes 2022-11-22 12:08:18 +01:00
Louis Gesbert
4ae392c900 AST refactoring
Many changes got bundled in here and would be too tedious to separate.

Closes #330

See changes in `shared_ast/definitions.ml` to check the main point.

- the biggest change is a modification of the struct and enum types in
  expressions: they are now stored as `Map`s throughout passes, and no longer
  converted to indexed lists after scopelang. Their accessors are also changed,
  and tuples only exist in Lcalc (they're used for closure conversion).

  This implied adding some more information in the contexts, to keep the mapping
  between struct fields and scope output variables. It should also be much more
  robust (no longer relying on assumptions upon different orderings).

- another very pervasive change is more cosmetic: the rewrite of the main AST to
  use inline records, labelling individual subfields.

- moved the checks for correct definitions and accesses of structures from
  `Scope_to_dcalc` to `Typing`

- defining some new shallow iterators in module `Shared_ast.Expr`, and
  factorising a few same-pass rewriting functions accordingly (closure
  conversion, optimisations, etc.)

- some smaller style improvements (ensuring we use the proper compare/equal
  functions instead of `=` in a few `when` closes, for example)
2022-11-17 18:16:09 +01:00
Louis Gesbert
6e2c3eee4d Fix the regression on badly tagged scope variable error message
it actually simplifies the typer a little to not care about this specific error,
which is better handled in desugared_to_scope already.
2022-10-25 14:50:49 +02:00
Louis Gesbert
41d6d3cbe9 Make scopes directly callable
Quite a few changes are included here, some of which have some extra
implications visible in the language:

- adds the `Scope of { -- input_v: value; ... }` construct in the language

- handle it down the pipeline:
  * `ScopeCall` in the surface AST
  * `EScopeCall` in desugared and scopelang
  * expressions are now traversed to detect dependencies between scopes
  * transformed into a normal function call in dcalc

- defining a scope now implicitely defines a structure with the same name, with
  the output variables of the scope defined as fields. This allows us to type
  the return value from a scope call and access its fields easily.
  * the implications are mostly in surface/name_resolution.ml code-wise
  * the `Scope_out` struct that was defined in scope_to_dcalc is no longer
    needed/used and the fields are no longer renamed (changes some outputs; the
    explicit suffix for variables with multiple states is ignored as well)
  * one benefit is that disambiguation works just like for structures when there
    are conflicts on field names
  * however, it's now a conflict if a scope and a structure have the same
    name (side-note: issues with conflicting enum / struct names or scope
    variables / subscope names were silent and are now properly reported)

- you can consequently use scope names as types for variables as well. Writing
  literals is not allowed though, they can only be obtained by calling the
  scope.

Remaining TODOs:

- context variables are not handled properly at the moment

- error handling on invalid calls

- tests show a small error message regression; lots of examples will need
  tweaking to avoid scope/struct name or struct fields / output variable
  conflicts

- add a `->` syntax to make struct field access distinct from scope output var
  access, enforced with typing. This is expected to reduce confusion of users
  and add a little typing precision.

- document the new syntax & implications (tutorial, cheat-sheet)

- a consequence of the changes is that subscope variables also can now be typed.
  A possible future evolution / simplification would be to rewrite subscopes as
  explicit scope calls early in the pipeline. That could also allow to manipulate
  them as expressions (bind them in let-ins, return them...)
2022-10-21 17:17:26 +02:00
Louis Gesbert
f9f834e30a Add a helper to fold on expressions 2022-10-21 15:35:49 +02:00
Louis Gesbert
e925ec1795 Swap boxing and annotations in expressions
This was the only reasonable solution I found to the issue raised
[here](https://github.com/CatalaLang/catala/pull/334#discussion_r987175884).

This was a pretty tedious rewrite, but it should now ensure we are doing things
correctly. As a bonus, the "smart" expression constructors are now used
everywhere to build expressions (so another refactoring like this one should be
much easier) and this makes the code overall feel more
straightforward (`Bindlib.box_apply` or `let+` no longer need to be visible!)

---

Basically, we were using values of type `gexpr box = naked_gexpr marked box`
throughout when (re-)building expressions. This was done 99% of the time by
using `Bindlib.box_apply add_mark naked_e` right after building `naked_e`. In
lots of places, we needed to recover the annotation of this expression later on,
typically to build its parent term (to inherit the position, or build the type).

Since it wasn't always possible to wrap these uses within `box_apply` (esp. as
bindlib boxes aren't a monad), here and there we had to call `Bindlib.unbox`,
just to recover the position or type. This had the very unpleasant effect of
forcing the resolution of the whole box (including applying any stored closures)
to reach the top-level annotation which isn't even dependant on specific
variable bindings. Then, generally, throwing away the result.

Therefore, the change proposed here transforms
- `naked_gexpr marked Bindlib.box` into
- `naked_gexpr Bindlib.box marked` (aliased to `boxed_gexpr` or `gexpr boxed` for
convenience)

This means only
1. not fitting the mark into the box right away when building, and
2. accessing the top-level mark directly without unboxing

The functions for building terms from module `Shared_ast.Expr` could be changed
easily. But then they needed to be consistently used throughout, without
manually building terms through `Bindlib.apply_box` -- which covers most of the
changes in this patch.

`Expr.Box.inj` is provided to swap back to a box, before binding for example.

Additionally, this gives a 40% speedup on `make -C examples pass_all_tests`,
which hints at the amount of unnecessary work we were doing --'
2022-10-07 18:00:23 +02:00
Louis Gesbert
0fdefacf7c Add marks to scopelang Call 2022-10-04 14:50:37 +02:00
Louis Gesbert
41315dc650 Scopelang: add toplevel mark for convenience
it allows to discriminate typed and non-typed ASTs
2022-10-04 14:50:37 +02:00
Louis Gesbert
3b4b070aaa Fix typing 2022-10-04 14:50:37 +02:00
Louis Gesbert
2955ef3235 Implement typing at the scopelang level 2022-10-04 14:50:37 +02:00
Louis Gesbert
51f79af13e Generalise the types to allow scopelang ASTs to be typed 2022-10-04 14:50:37 +02:00
Louis Gesbert
af9f497ffb Implement typing of desugared/scopelang and lcalc terms
Note that this is incomplete in the case of desugared/scopelang because we only
have typing for expressions yet, and the scope/program structure is different.

The code allows passing an environment of types for scope/subscope variables in
order to resolve `ELocation` terms, but that's unused until we implement
scopelang typing at the scope level.
2022-10-04 14:50:37 +02:00
Louis Gesbert
84c78a234f
Make desugared and scopelang use the 'a mark type for AST annotations
This gives further uniformity in their interfaces and allows more common
handling.

The next step will be for all the `Expr.make_*` functions to work on expressions
annotated with the `'a mark` type, correctly propagating type information when
it is present. Then we could even imagine early propagation of type
information (without complete inference), which could for example be used for
overloaded operator disambiguation.
2022-08-29 11:29:24 +02:00
Louis Gesbert
5e9c3d630e
Same treatment for typ and marked_typ 2022-08-29 11:29:24 +02:00
Louis Gesbert
be58610061
Rename marked_expr -> expr, expr -> naked_expr throughout
Since the marked kind is used throughout, this should be more clear
2022-08-29 11:29:23 +02:00
Louis Gesbert
8f7ba5ccaf
Rename marked_gexpr -> gexpr, gexpr -> naked_gexpr
Since the marked kind is used throughout, this should be more clear
2022-08-29 11:29:23 +02:00
Louis Gesbert
ef36b18dfe And finally the desugared AST as well 2022-08-26 11:31:14 +02:00
Louis Gesbert
01cc957b3b Used shared_ast for scopelang expressions 2022-08-26 11:31:14 +02:00