Commit Graph

72 Commits

Author SHA1 Message Date
Louis Gesbert
7db63e5f78 Simplification: store paths in Uids
rather than scattered in structures

The context is still hierarchical for defs though, so one needs to retrieve the
path to lookup in the correct context for info. Exceptions are enums and struct
defs, which are re-exposed at toplevel.
2023-08-31 18:31:48 +02:00
Louis Gesbert
72882f82df Reformat 2023-08-31 17:55:36 +02:00
Louis Gesbert
9bac045d03 Implement module lookups for scopes, structs, and enums 2023-08-31 17:54:39 +02:00
Louis Gesbert
db34c9a848 Generalise the expression printer
This patch functorises the generic expression printer, in order to be able to
re-use it for end-user printing.

It makes it possible to have an end-user, localised printer that shares the code
for e.g. priority and automatic parens handling.

A generic AST rewriting that disambiguates variables (very simple to write with
bindlib) is also added and used in the OCaml backend for something safer than
just appending `_user` (-- this also handles clashing variables that could be
introduced during compilation which would have generated wrong code before this)

Finally, the `explain` plugin is adapted to use the new printer.

Ah, and `String.format_t` was tweaked to correctly print strings that might
contain unicode without breaking alignment, and should be used instead of
`format_string` or `%s` whenever unicode can be expected.
2023-07-11 17:33:56 +02:00
Denis Merigoux
5f48e5dac1
Merge branch 'master' into closure_conversion 2023-06-20 11:02:13 +02:00
Louis Gesbert
237dca2e75 Some commented code cleanup and clarifications 2023-06-19 16:36:09 +02:00
Louis Gesbert
e224e87f71 Wip support for modules
(first working dynload test with compilation done by manual calls to ocaml)

A few pieces of the puzzle:

* Loading of interfaces only from Catala files
* Registration of toplevel values in modules compiled to OCaml, to allow access
  using dynlink
* Shady conversion from OCaml runtime values to/from Catala expressions, to
  allow interop (ffi) of compiled modules and the interpreter
2023-06-15 17:56:57 +02:00
Louis Gesbert
2b7beeefb2 Rename 'IdentName' to 'Ident' 2023-06-15 17:55:49 +02:00
Denis Merigoux
cdae3e43ac
Improve names of temp variable in monadic pass 2023-06-12 15:02:08 +02:00
Louis Gesbert
209be6b758 Improve integration of marks into the main AST
Two interdependent changes here:
1. Enforce all instances of Shared_ast.gexpr to use the generic type for marks.
   This makes the interfaces a tad simpler to manipulate: you now write
   `('a, 'm) gexpr` rather than `('a, 'm mark) gexpr`.
2. Define a polymorphic `Custom` mark case for use by pass-specific annotations.
   And leverage this in the typing module
2023-05-17 17:37:00 +02:00
Louis Gesbert
fc531777c0 Rework and normalise the Marked interface
The module is renamed to `Mark`, and functions renamed to avoid redundancy:

`Marked.mark` is now `Mark.add`
`Marked.unmark` is now `Mark.remove`
`Marked.map_under_mark` is now simply `Mark.map`
etc.

`Marked.same_mark_as` is replaced by `Mark.copy`, but with the arguments
swapped (which seemed more convenient throughout)

Since a type `Mark.t` would indicate a mark, and to avoid confusion, the type
`Marked.t` is renamed to `Mark.ed` as a shorthand for `Mark.marked` ; this part
can easily be removed if that's too much quirkiness.
2023-05-17 17:37:00 +02:00
Louis Gesbert
ba52aae401 Cleanup: definitions.ml is not for values
A module without mli is ok as long as it only contains types

Here we already stretch it a bit with some functor applications, but having
toplevel values defeats the expectation that you can safely `open` this module.
2023-05-17 13:26:47 +02:00
Louis Gesbert
5e26c5c83d Yet more printer improvements
- Fix the printer for scopes
- Improve the printer for struct types
- Remove `Print.expr'`. Use `Expr.format` as the function with simplified arguments instead.
2023-05-02 16:33:23 +02:00
Louis Gesbert
83e7a845fe Cleanup expr printer interface
- `Print.expr` no longer needs the context
- This removes the need for `expr ~debug` + `expr_debug` ;
  use `Print.expr` for normal (non-debug) output,
  and `Print.expr' ?debug ()` for possibly debug output.
- This improves consistency of debug expr output in many places
- Prints simplified operators (without type suffix) in non-verbose mode

(this patch also fixes some cases of `Expr.skip_wrappers` and leverages the
binder equality provided by Bindlib)
2023-05-02 13:32:16 +02:00
Louis Gesbert
49bd5b1915 Cleanup type of Expr.make_app 2023-04-24 15:14:54 +02:00
Denis Merigoux
d384db4e71
Reestablish some default constructor optimizations 2023-04-21 14:35:10 +02:00
Denis Merigoux
067c7b9155
Merge branch 'master' into adelaett-withoutexceptionsfix 2023-04-21 10:55:36 +02:00
Louis Gesbert
55d343d81c Version that uses object types instead of polymorphic variants
in order to get the row polymorphism controlling the GADT that encodes our AST
2023-04-20 13:51:20 +02:00
adelaett
02eeb4ad11
Include Bindlib_ext to Expr.Box 2023-04-14 14:18:28 +02:00
adelaett
ebf72213a7
deadcode 2023-04-14 14:12:36 +02:00
adelaett
ddeaa67ff7
Expr.eid -> Expr.fun_id 2023-04-14 14:07:51 +02:00
adelaett
cc66023e51
Thunking justifications and conclusion in avoid_translation pass 2023-04-12 10:58:21 +02:00
adelaett
3e35d4b826
Merge branch 'master' into adelaett-withoutexceptionsfix 2023-04-11 11:49:22 +02:00
alain
ec40de83fc
Merge branch 'master' into adelaett-withoutexceptionsfix 2023-04-06 13:57:22 +02:00
Louis Gesbert
0098f00512 Yet some more small improvements to the AST encoding 2023-04-05 10:32:58 +02:00
Louis Gesbert
0c1cd481e1 Interpreter on dcalc + lcalc (the simple way)
I made some changes in the meantime, and had to factorise e.g. the handling of
the `EEmptyError` case, but this is the simple approach type-wise of making the
function type for `∀ 'a. 'a —> 'a` (with `assert false` match cases), then
restricting its type do `dcalc` or `lcalc` in the `.mli`.
2023-04-05 10:32:58 +02:00
Louis Gesbert
79ff776d2e Multi-pass interperter: typing (but useless) version 2023-04-05 10:32:58 +02:00
Denis Merigoux
38b0041bb8
Add common linting passes to Catala (#438) 2023-04-03 14:01:02 +02:00
Denis Merigoux
d147238088
Apply suggestions by @altgr 2023-04-03 13:42:14 +02:00
adelaett
e9ead93f3f fix typing errors 2023-03-31 16:01:05 +02:00
adelaett
573df8416f Merge branch 'master' into adelaett-withoutexceptionsfix 2023-03-31 15:52:06 +02:00
Denis Merigoux
565aa23b8f
Implemented some lints 2023-03-31 11:47:44 +02:00
Louis Gesbert
038861a52c Generic mapping function across different ASTs
Used in lcalc/compile_with_exceptions only at the moment
2023-03-30 18:57:51 +02:00
Louis Gesbert
1208744c6b EmptyError is no longer a literal
it's much simpler to handle it as an AST node, as that makes the literal
identical across all AST passes.
2023-03-30 18:54:50 +02:00
Louis Gesbert
a415355a39 Rework the AST Gadt to allow merging of different ASTs
The phantom polymorphic variant qualifying AST nodes is reversed:
- previously, we were explicitely restricting each AST node to the passes where it belonged using a closed type (e.g. `[< dcalc | lcalc]`)
- now, each node instead declares the "feature" it provides using an open type (e.g. `[> 'Exceptions ]`)
- then the AST for a specific pass limits the features it allows with a closed type

The result is that you can mix and match all features if you wish,
even if the result is not a valid AST for any given pass. More
interestingly, it's now easier to write a function that works on
different ASTs at once (it's the inferred default if you don't write a
type restriction).

The opportunity was also taken to simplify the encoding of the
operators, which don't need a second type parameter anymore.
2023-03-30 15:30:08 +02:00
adelaett
21577ff1ba introducing usefull term in the shared ast 2023-03-10 15:33:58 +01:00
adelaett
6c3f0af9e0 invariant assertion checking 2023-02-27 11:20:59 +01:00
adelaett
9319e94617 fix typo 2023-02-27 11:20:59 +01:00
adelaett
0262019d45 make app 2023-02-27 11:20:59 +01:00
adelaett
363ef39704 let case 2023-02-27 11:20:59 +01:00
Louis Gesbert
0540cd31fe Allow ETuple, ETupleAccess on all ASTs
they used to be only allowed on lcalc
2023-02-13 10:51:42 +01:00
Louis Gesbert
fea01cfe4c Add overloaded operators for the common operations
This uses the same disambiguation mechanism put in place for
structures, calling the typer on individual rules on the desugared AST
to propagate types, in order to resolve ambiguous operators like `+`
to their strongly typed counterparts (`+!`, `+.`, `+$`, `+@`, `+$`) in
the translation to scopelang.

The patch includes some normalisation of the definition of all the
operators, and classifies them based on their typing policy instead of
their arity. It also adds a little more flexibility:
- a couple new operators, like `-` on date and duration
- optional type annotation on some aggregation constructions

The `Shared_ast` lib is also lightly restructured, with the `Expr`
module split into `Type`, `Operator` and `Expr`.
2022-12-13 11:55:24 +01:00
Louis Gesbert
8960e5dbbc Add typing-based disambiguation pass after desugaring
Some typing errors are changed a little, because they get triggered during the
typing of the disambiguation pass, which does not specify the expected return
type (it's an expected invariant that it should not be needed for
disambiguation).

It would be possible to still specify these types during disambiguation just to
get the same errors, but since the newer ones don't appear to be clearly worse
at the moment, it has not been done.
2022-11-28 16:38:09 +01:00
Louis Gesbert
3f2aa19e97 Add ambiguous StructAccess for desugared
to be resolved in scopelang
2022-11-28 16:38:09 +01:00
Louis Gesbert
660e5775de Rename utils to catala_utils 2022-11-28 16:38:09 +01:00
Louis Gesbert
b329afbbdb Rename all Map/Set calls accordingly
This is just a bunch of `sed` calls:
```shell
sed -i 's/ScopeSet/ScopeName.Set/g' compiler/**/*.ml*
sed -i 's/ScopeMap/ScopeName.Map/g' compiler/**/*.ml*
sed -i 's/StructMap/StructName.Map/g' compiler/**/*.ml*
sed -i 's/StructSet/StructName.Set/g' compiler/**/*.ml*
sed -i 's/EnumMap/EnumName.Map/g' compiler/**/*.ml*
sed -i 's/EnumSet/EnumName.Set/g' compiler/**/*.ml*
sed -i 's/StructFieldName/StructField/g' compiler/**/*.ml*
sed -i 's/StructFieldMap/StructField.Map/g' compiler/**/*.ml*
sed -i 's/StructFieldSet/StructField.Set/g' compiler/**/*.ml*
sed -i 's/EnumConstructorMap/EnumConstructor.Map/g' compiler/**/*.ml*
sed -i 's/EnumConstructorSet/EnumConstructor.Set/g' compiler/**/*.ml*
sed -i 's/RuleMap/RuleName.Map/g' compiler/**/*.ml*
sed -i 's/RuleSet/RuleName.Set/g' compiler/**/*.ml*
sed -i 's/LabelMap/LabelName.Map/g' compiler/**/*.ml*
sed -i 's/LabelSet/LabelName.Set/g' compiler/**/*.ml*
sed -i 's/ScopeVarMap/ScopeVar.Map/g' compiler/**/*.ml*
sed -i 's/ScopeVarSet/ScopeVar.Set/g' compiler/**/*.ml*
sed -i 's/SubScopeNameMap/SubScopeName.Map/g' compiler/**/*.ml*
sed -i 's/SubScopeNameSet/SubScopeName.Set/g' compiler/**/*.ml*
```

... and reformat
2022-11-28 16:38:09 +01:00
Louis Gesbert
4dfb4ab44f Add some more doc for Expr.shallow_fold and Expr.map_gather 2022-11-21 17:54:17 +01:00
Louis Gesbert
4ae392c900 AST refactoring
Many changes got bundled in here and would be too tedious to separate.

Closes #330

See changes in `shared_ast/definitions.ml` to check the main point.

- the biggest change is a modification of the struct and enum types in
  expressions: they are now stored as `Map`s throughout passes, and no longer
  converted to indexed lists after scopelang. Their accessors are also changed,
  and tuples only exist in Lcalc (they're used for closure conversion).

  This implied adding some more information in the contexts, to keep the mapping
  between struct fields and scope output variables. It should also be much more
  robust (no longer relying on assumptions upon different orderings).

- another very pervasive change is more cosmetic: the rewrite of the main AST to
  use inline records, labelling individual subfields.

- moved the checks for correct definitions and accesses of structures from
  `Scope_to_dcalc` to `Typing`

- defining some new shallow iterators in module `Shared_ast.Expr`, and
  factorising a few same-pass rewriting functions accordingly (closure
  conversion, optimisations, etc.)

- some smaller style improvements (ensuring we use the proper compare/equal
  functions instead of `=` in a few `when` closes, for example)
2022-11-17 18:16:09 +01:00
Louis Gesbert
73173285e4 Scope calls: proper handling of context vars
Also proper error messages on bad scope input specifications.

* Still needs more tests
2022-10-25 11:30:45 +02:00
Louis Gesbert
41d6d3cbe9 Make scopes directly callable
Quite a few changes are included here, some of which have some extra
implications visible in the language:

- adds the `Scope of { -- input_v: value; ... }` construct in the language

- handle it down the pipeline:
  * `ScopeCall` in the surface AST
  * `EScopeCall` in desugared and scopelang
  * expressions are now traversed to detect dependencies between scopes
  * transformed into a normal function call in dcalc

- defining a scope now implicitely defines a structure with the same name, with
  the output variables of the scope defined as fields. This allows us to type
  the return value from a scope call and access its fields easily.
  * the implications are mostly in surface/name_resolution.ml code-wise
  * the `Scope_out` struct that was defined in scope_to_dcalc is no longer
    needed/used and the fields are no longer renamed (changes some outputs; the
    explicit suffix for variables with multiple states is ignored as well)
  * one benefit is that disambiguation works just like for structures when there
    are conflicts on field names
  * however, it's now a conflict if a scope and a structure have the same
    name (side-note: issues with conflicting enum / struct names or scope
    variables / subscope names were silent and are now properly reported)

- you can consequently use scope names as types for variables as well. Writing
  literals is not allowed though, they can only be obtained by calling the
  scope.

Remaining TODOs:

- context variables are not handled properly at the moment

- error handling on invalid calls

- tests show a small error message regression; lots of examples will need
  tweaking to avoid scope/struct name or struct fields / output variable
  conflicts

- add a `->` syntax to make struct field access distinct from scope output var
  access, enforced with typing. This is expected to reduce confusion of users
  and add a little typing precision.

- document the new syntax & implications (tutorial, cheat-sheet)

- a consequence of the changes is that subscope variables also can now be typed.
  A possible future evolution / simplification would be to rewrite subscopes as
  explicit scope calls early in the pipeline. That could also allow to manipulate
  them as expressions (bind them in let-ins, return them...)
2022-10-21 17:17:26 +02:00