Commit Graph

211 Commits

Author SHA1 Message Date
Louis Gesbert
fea01cfe4c Add overloaded operators for the common operations
This uses the same disambiguation mechanism put in place for
structures, calling the typer on individual rules on the desugared AST
to propagate types, in order to resolve ambiguous operators like `+`
to their strongly typed counterparts (`+!`, `+.`, `+$`, `+@`, `+^`) in
the translation to scopelang.

The patch includes some normalisation of the definition of all the
operators, and classifies them based on their typing policy instead of
their arity. It also adds a little more flexibility:
- a couple new operators, like `-` on date and duration
- optional type annotation on some aggregation constructions

The `Shared_ast` lib is also lightly restructured, with the `Expr`
module split into `Type`, `Operator` and `Expr`.
2022-12-13 11:55:24 +01:00
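
Illustration for fea01cfe4c: a minimal sketch of type-directed operator resolution, where an ambiguous `+` is mapped to a typed operator once the operand types are known. The types `typ` and `add_op` and the function `resolve_add` are hypothetical and much simpler than anything in the compiler; the pairing of suffixes with types in the comments is an assumption.

```ocaml
(* Hypothetical, simplified types: not the compiler definitions. *)
type typ = TInt | TRat | TMoney | TDate | TDuration

(* Strongly typed additions; suffixes in comments are assumed pairings. *)
type add_op =
  | Add_int      (* +! *)
  | Add_rat      (* +. *)
  | Add_money    (* +$ *)
  | Add_date_dur (* +@ *)
  | Add_dur      (* +^ *)

(* Resolve an ambiguous + from the types inferred for its two operands. *)
let resolve_add (t1 : typ) (t2 : typ) : (add_op, string) result =
  match t1, t2 with
  | TInt, TInt -> Ok Add_int
  | TRat, TRat -> Ok Add_rat
  | TMoney, TMoney -> Ok Add_money
  | TDate, TDuration -> Ok Add_date_dur
  | TDuration, TDuration -> Ok Add_dur
  | _ -> Error "no overload of + for these operand types"
```

The sketch only covers the lookup step; in the commit, the operand types come from running the typer on the desugared rules.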
Louis Gesbert
3f2aa19e97 Add ambiguous StructAccess for desugared
to be resolved in scopelang
2022-11-28 16:38:09 +01:00
Louis Gesbert
af2f5dbe19 Tweak error message location printing 2022-11-28 16:38:09 +01:00
Louis Gesbert
9fc4c0c10c Define Catala_utils.String as an overlay to stdlib string 2022-11-28 16:38:09 +01:00
Louis Gesbert
660e5775de Rename utils to catala_utils 2022-11-28 16:38:09 +01:00
Louis Gesbert
b329afbbdb Rename all Map/Set calls accordingly
This is just a bunch of `sed` calls:
```shell
sed -i 's/ScopeSet/ScopeName.Set/g' compiler/**/*.ml*
sed -i 's/ScopeMap/ScopeName.Map/g' compiler/**/*.ml*
sed -i 's/StructMap/StructName.Map/g' compiler/**/*.ml*
sed -i 's/StructSet/StructName.Set/g' compiler/**/*.ml*
sed -i 's/EnumMap/EnumName.Map/g' compiler/**/*.ml*
sed -i 's/EnumSet/EnumName.Set/g' compiler/**/*.ml*
sed -i 's/StructFieldName/StructField/g' compiler/**/*.ml*
sed -i 's/StructFieldMap/StructField.Map/g' compiler/**/*.ml*
sed -i 's/StructFieldSet/StructField.Set/g' compiler/**/*.ml*
sed -i 's/EnumConstructorMap/EnumConstructor.Map/g' compiler/**/*.ml*
sed -i 's/EnumConstructorSet/EnumConstructor.Set/g' compiler/**/*.ml*
sed -i 's/RuleMap/RuleName.Map/g' compiler/**/*.ml*
sed -i 's/RuleSet/RuleName.Set/g' compiler/**/*.ml*
sed -i 's/LabelMap/LabelName.Map/g' compiler/**/*.ml*
sed -i 's/LabelSet/LabelName.Set/g' compiler/**/*.ml*
sed -i 's/ScopeVarMap/ScopeVar.Map/g' compiler/**/*.ml*
sed -i 's/ScopeVarSet/ScopeVar.Set/g' compiler/**/*.ml*
sed -i 's/SubScopeNameMap/SubScopeName.Map/g' compiler/**/*.ml*
sed -i 's/SubScopeNameSet/SubScopeName.Set/g' compiler/**/*.ml*
```

... and reformat
2022-11-28 16:38:09 +01:00
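
Illustration for b329afbbdb: the renaming suggests that each identifier module now carries its own `Map` and `Set` submodules. A minimal sketch of that layout, assuming a hypothetical generative functor `Make_id` (the signature `ID` and the functions `fresh` and `to_string` are invented for this example):

```ocaml
(* Hypothetical sketch: an identifier module carrying its own Map and Set. *)
module type ID = sig
  type t
  val fresh : string -> t
  val to_string : t -> string
  val compare : t -> t -> int
  module Map : Map.S with type key = t
  module Set : Set.S with type elt = t
end

(* Each application yields a distinct abstract type t. *)
module Make_id () : ID = struct
  type t = int * string
  let counter = ref 0
  let fresh name = incr counter; (!counter, name)
  let to_string (_, name) = name
  let compare (i, _) (j, _) = Int.compare i j
  module Ord = struct type nonrec t = t let compare = compare end
  module Map = Map.Make (Ord)
  module Set = Set.Make (Ord)
end

module ScopeName = Make_id ()
module StructName = Make_id ()

(* Usage: ScopeName.Map and ScopeName.Set replace ScopeMap and ScopeSet. *)
let s1 = ScopeName.fresh "Foo"
let scopes = ScopeName.Set.of_list [ s1; ScopeName.fresh "Bar" ]
let arities = ScopeName.Map.singleton s1 2
```

Because the functor is generative, a `StructName.t` cannot be used as a key of `ScopeName.Map`, which is one plausible motivation for this layout.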
Louis Gesbert
47799ea24f Uniform naming of conversion modules across compilation passes 2022-11-22 12:08:18 +01:00
Louis Gesbert
4ae392c900 AST refactoring
Many changes got bundled in here and would be too tedious to separate.

Closes #330

See changes in `shared_ast/definitions.ml` to check the main point.

- the biggest change is a modification of the struct and enum types in
  expressions: they are now stored as `Map`s throughout passes, and no longer
  converted to indexed lists after scopelang. Their accessors are also changed,
  and tuples only exist in Lcalc (they're used for closure conversion).

  This implied adding some more information in the contexts, to keep the mapping
  between struct fields and scope output variables. It should also be much more
  robust (no longer relying on assumptions upon different orderings).

- another very pervasive change is more cosmetic: the rewrite of the main AST to
  use inline records, labelling individual subfields.

- moved the checks for correct definitions and accesses of structures from
  `Scope_to_dcalc` to `Typing`

- defining some new shallow iterators in module `Shared_ast.Expr`, and
  factorising a few same-pass rewriting functions accordingly (closure
  conversion, optimisations, etc.)

- some smaller style improvements (ensuring we use the proper compare/equal
  functions instead of `=` in a few `when` clauses, for example)
2022-11-17 18:16:09 +01:00
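
Illustration for 4ae392c900: a minimal sketch of struct values stored as maps keyed by field, with constructors written as inline records. The type `expr`, its constructors, and the `eval` function are hypothetical and far smaller than the definitions in `shared_ast/definitions.ml`.

```ocaml
(* Hypothetical field identifiers: plain strings with a Map submodule. *)
module StructField = struct
  type t = string
  module Map = Map.Make (String)
end

(* Inline records label each subfield of a constructor. *)
type expr =
  | ELit of int
  | EStruct of { name : string; fields : expr StructField.Map.t }
  | EStructAccess of { e : expr; field : StructField.t; name : string }

(* Field access is a map lookup, so it never depends on field ordering. *)
let rec eval (e : expr) : expr =
  match e with
  | ELit _ -> e
  | EStruct { name; fields } ->
      EStruct { name; fields = StructField.Map.map eval fields }
  | EStructAccess { e = sub; field; name } -> (
      match eval sub with
      | EStruct { name = n; fields } when String.equal n name -> (
          match StructField.Map.find_opt field fields with
          | Some v -> v
          | None -> failwith ("unknown field " ^ field))
      | _ -> failwith ("expected a value of struct " ^ name))

(* Usage: fields inserted in any order are found by name. *)
let point =
  EStruct
    { name = "Point";
      fields = StructField.Map.(empty |> add "y" (ELit 2) |> add "x" (ELit 1)) }
```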
Louis Gesbert
7267543ca1 Rename Expr.Box.inj to Expr.Box.lift
it is more consistent with the naming of functions in Bindlib.
2022-10-21 15:35:49 +02:00
Louis Gesbert
e925ec1795 Swap boxing and annotations in expressions
This was the only reasonable solution I found to the issue raised
[here](https://github.com/CatalaLang/catala/pull/334#discussion_r987175884).

This was a pretty tedious rewrite, but it should now ensure we are doing things
correctly. As a bonus, the "smart" expression constructors are now used
everywhere to build expressions (so another refactoring like this one should be
much easier) and this makes the code overall feel more
straightforward (`Bindlib.box_apply` or `let+` no longer need to be visible!)

---

Basically, we were using values of type `gexpr box = naked_gexpr marked box`
throughout when (re-)building expressions. This was done 99% of the time by
using `Bindlib.box_apply add_mark naked_e` right after building `naked_e`. In
lots of places, we needed to recover the annotation of this expression later on,
typically to build its parent term (to inherit the position, or build the type).

Since it wasn't always possible to wrap these uses within `box_apply` (esp. as
bindlib boxes aren't a monad), here and there we had to call `Bindlib.unbox`,
just to recover the position or type. This had the very unpleasant effect of
forcing the resolution of the whole box (including applying any stored closures)
to reach the top-level annotation, which isn't even dependent on specific
variable bindings. The result was then generally thrown away.

Therefore, the change proposed here transforms
- `naked_gexpr marked Bindlib.box` into
- `naked_gexpr Bindlib.box marked` (aliased to `boxed_gexpr` or `gexpr boxed` for
convenience)

This means only
1. not fitting the mark into the box right away when building, and
2. accessing the top-level mark directly without unboxing

The functions for building terms from module `Shared_ast.Expr` could be changed
easily. But then they needed to be consistently used throughout, without
manually building terms through `Bindlib.apply_box` -- which covers most of the
changes in this patch.

`Expr.Box.inj` is provided to swap back to a box, before binding for example.

Additionally, this gives a 40% speedup on `make -C examples pass_all_tests`,
which hints at the amount of unnecessary work we were doing --'
2022-10-07 18:00:23 +02:00
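
Illustration for e925ec1795: a minimal sketch of why swapping the box and the mark avoids forcing. It does not use Bindlib; the `box` type below is a stand-in (a suspended computation that counts how often it is forced), and `naked`, `pos`, `pos_old` and `pos_new` are hypothetical names.

```ocaml
(* Stand-in for a Bindlib box: a suspended computation. NOT the Bindlib API. *)
type 'a box = unit -> 'a

(* Counts how many times any box has been forced. *)
let forced = ref 0
let box (x : 'a) : 'a box = fun () -> incr forced; x
let unbox (b : 'a box) : 'a = b ()

type pos = { line : int }
type 'a marked = 'a * pos
type naked = Lit of int

(* Old layout: the mark lives inside the box. *)
let old_style : naked marked box = box (Lit 1, { line = 3 })

(* New layout: the mark sits outside the box. *)
let new_style : naked box marked = (box (Lit 1), { line = 3 })

(* Reading the position forces the whole box in the old layout... *)
let pos_old (e : naked marked box) : pos = snd (unbox e)

(* ...and is a plain pair projection in the new one. *)
let pos_new (e : naked box marked) : pos = snd e
```

With real Bindlib boxes, forcing also applies any stored closures, so the cost avoided is larger than this counter suggests.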
Louis Gesbert
14f1ebfd0a Reformat 2022-10-04 14:50:37 +02:00
Louis Gesbert
ea114bada2 Fix one more typing mismatch 2022-10-04 14:50:37 +02:00
Louis Gesbert
0bb9cce341 Simplify a few mark operations 2022-10-04 14:50:37 +02:00
Louis Gesbert
d93b699a4c Forward types in the Expr.make_* constructors
Also add some safeguards against bad propagation of types (e.g. checking the
arrow type of functions upon application); partly disabled at the moment since
they don't pass yet but that'll be further work.
2022-10-04 14:50:37 +02:00
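
Illustration for d93b699a4c: a minimal sketch of a constructor that forwards types and checks the arrow type of a function upon application. The types and the functions `make_var` and `make_app` are hypothetical and only loosely modelled on the `Expr.make_*` naming.

```ocaml
(* Hypothetical, simplified types and typed expressions. *)
type typ = TInt | TBool | TArrow of typ * typ

type expr = { node : node; ty : typ }
and node = Var of string | App of expr * expr

let make_var (name : string) (ty : typ) : expr = { node = Var name; ty }

(* Build an application: check the arrow type, forward the result type. *)
let make_app (f : expr) (arg : expr) : expr =
  match f.ty with
  | TArrow (t_in, t_out) when t_in = arg.ty ->
      { node = App (f, arg); ty = t_out }
  | TArrow _ -> invalid_arg "make_app: argument type mismatch"
  | _ -> invalid_arg "make_app: applying a non-function"

(* Usage: the type of the application is derived, not supplied. *)
let f = make_var "f" (TArrow (TInt, TBool))
let x = make_var "x" TInt
let fx = make_app f x
```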
Denis Merigoux
6130151c8e Fix bug and typos 2022-09-05 14:50:37 +02:00
Denis Merigoux
e5963e5381 Merge branch 'master' into altgr_allmarks 2022-08-29 11:57:06 +02:00
Louis Gesbert
7e0d24efd2 Make all supertypes use ('a, 't) gexpr as parameter instead of naked_gexpr 2022-08-29 11:29:24 +02:00
Louis Gesbert
5e9c3d630e Same treatment for typ and marked_typ 2022-08-29 11:29:24 +02:00
Louis Gesbert
be58610061 Rename marked_expr -> expr, expr -> naked_expr throughout
Since the marked kind is used throughout, this should be more clear
2022-08-29 11:29:23 +02:00
Louis Gesbert
8f7ba5ccaf Rename marked_gexpr -> gexpr, gexpr -> naked_gexpr
Since the marked kind is used throughout, this should be more clear
2022-08-29 11:29:23 +02:00
Louis Gesbert
e10771c187 Make all supertypes use ('a, 't) gexpr as parameter instead of naked_gexpr 2022-08-29 10:57:21 +02:00
Louis Gesbert
a9c8bab2b3 Same treatment for typ and marked_typ 2022-08-29 10:57:21 +02:00
Louis Gesbert
0a23dc526d Rename marked_expr -> expr, expr -> naked_expr throughout
Since the marked kind is used throughout, this should be more clear
2022-08-29 10:57:21 +02:00
Louis Gesbert
493b6703a7 Rename marked_gexpr -> gexpr, gexpr -> naked_gexpr
Since the marked kind is used throughout, this should be more clear
2022-08-29 10:57:21 +02:00
Louis Gesbert
01cc957b3b Use shared_ast for scopelang expressions 2022-08-26 11:31:14 +02:00
Louis Gesbert
54eee2edea Rationalise the tuple / enum types
This will allow unifying with the types used earlier in the
pipeline (`Scopelang.Ast.typ`).

It seems cleaner! But some areas may warrant a later clean-up, in particular
handling of options and their types in the backends, or possible name conflicts
of structs/enums with built-in types when printing.
2022-08-23 15:48:06 +02:00
Louis Gesbert
4caf828e48 Additional cleanup/fixes on the compiler refactoring
following review ^^
2022-08-23 00:13:02 +02:00
Louis Gesbert
576e0fb3ff Factorise AST printers
Note that there were significant differences between the two printers (see the test diff!). Overall the `dcalc` one seemed newer so that's what I took, with only the required additions from `lcalc` (exceptions, raise and catch)
2022-08-22 19:28:27 +02:00
Louis Gesbert
ae2801be6d Move mode handling code from dcalc to shared_ast
Handling code should now be reasonably well sorted between `Shared_ast.{Var,Expr,Scope,Program}`

The function parameters (e.g. `make_let_in`) could be removed from the
scope handling functions since now the types are compatible, which
makes them much easier to read.
2022-08-22 19:28:27 +02:00
Louis Gesbert
8e7f65d204 Split Shared_ast.Expr of scope and program functions 2022-08-22 19:28:27 +02:00
Louis Gesbert
4bb49c14f1 Simplify some type aliases 2022-08-22 19:28:27 +02:00
Louis Gesbert
06dbab74d2 reformat 2022-08-22 19:28:27 +02:00
Louis Gesbert
2b6ee8dd4b Leverage the shared AST: big cleanup (part I) 2022-08-22 19:28:21 +02:00
Louis Gesbert
988e5eff1c Split the shared AST into a separate lib 2022-08-22 19:16:28 +02:00
Denis Merigoux
7ee971c4e1 Remove unused type definitions 2022-08-16 14:33:37 +02:00
Louis Gesbert
0b0e774d1c More factorisation, in particular for variables 2022-08-12 17:18:06 +02:00
Louis Gesbert
ebf97a0995 Pass-specific literals 2022-08-12 16:55:32 +02:00
Louis Gesbert
b5579cde3d Generalise the expressions between dcalc and lcalc
The huge benefit of this approach is that almost no changes are needed and we get compatible types between dcalc and lcalc, which allows deduplicating a few functions.

It might not be the best in the long run: there are still benefits in factorising small parts of the AST as suggested in #157, and this forces a central AST definition that makes the nanopass-like approach a bit less legible.

Still, I think it's a step in the right direction, and it doesn't really lock us into keeping the big GADT (as the minimal cascade of changes shows).
2022-08-12 16:55:30 +02:00
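
Illustration for b5579cde3d: a minimal sketch of one GADT shared between two passes, with a type parameter restricting which constructors each pass may use. The tag types, constructors, and functions below are hypothetical; the real definition covers far more cases.

```ocaml
(* Hypothetical pass tags: two distinct types used only as indices. *)
type dcalc = Dcalc_tag
type lcalc = Lcalc_tag

(* One AST for both passes; some constructors are pass-specific. *)
type _ expr =
  | Lit : int -> 'a expr
  | Add : 'a expr * 'a expr -> 'a expr
  | Default : dcalc expr * dcalc expr -> dcalc expr (* dcalc only *)
  | Raise : string -> lcalc expr (* lcalc only *)

(* Written once, usable on both passes. *)
let rec size : type a. a expr -> int = function
  | Lit _ -> 1
  | Add (e1, e2) -> 1 + size e1 + size e2
  | Default (e1, e2) -> 1 + size e1 + size e2
  | Raise _ -> 1

(* Pass-specific: no Raise case is needed, the type rules it out. *)
let rec count_defaults : dcalc expr -> int = function
  | Lit _ -> 0
  | Add (e1, e2) -> count_defaults e1 + count_defaults e2
  | Default (e1, e2) -> 1 + count_defaults e1 + count_defaults e2
```

Functions such as `size` are the kind that can be deduplicated, while `count_defaults` shows that pass-specific code keeps exhaustive matching without dummy cases.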
Emile Rolley
ba620fca28 ocamlformat: new break-infix rule 2022-08-05 10:55:48 +02:00
Emile Rolley
d85812109c refactor(compiler): remove the camomile dependency due to the new Utils.String_common module based on Ubase 2022-08-05 10:55:45 +02:00
Louis Gesbert
a569589193 Small improvements to the Python and OCaml pretty-printers 2022-08-04 20:43:39 +02:00
Denis Merigoux
4845196b5b Add source positions in all backends exceptions 2022-07-29 18:42:14 +02:00
Denis Merigoux
8d3e283669 Fix some bugs of JSOO plugin 2022-07-28 15:02:43 +02:00
Denis Merigoux
d17ac0bc39 More nitpicks 2022-07-22 18:04:16 +02:00
Denis Merigoux
fa55a83fb4 Merge branch 'master' into 290-jsoo-wrapper-plugin 2022-07-22 17:54:51 +02:00
Emile Rolley
ad0efd3447 refactor(ocaml): wrap enum types inside their own modules, like struct ones 2022-07-22 16:52:56 +02:00
Emile Rolley
0a9e563450 refactor(to_ocaml): format_to_struct_type -> format_to_module_name 2022-07-22 16:52:56 +02:00
Emile Rolley
3dcf856ec6 refactor(cli): add Cli.call_unstyled 2022-07-22 16:52:56 +02:00
Emile Rolley
b2bba6eaf0 feat(jsoo): factorize log-event-related functions into the eventManager object 2022-07-22 16:52:56 +02:00
Emile Rolley
37a8cf7090 fix(rebase): change Lcalc and Dcalc AST manipulation according to #272 2022-07-22 16:52:56 +02:00