Commit Graph

40 Commits

Author SHA1 Message Date
Louis Gesbert
14a378a33d Translation to scalc: fix renaming in blocks
Statements are often flattened, in which case their idents need to be
conflict-free. We pass along the renaming context to handle this.
2024-08-28 18:12:32 +02:00
Louis Gesbert
081e07378a Renaming: move to its own module 2024-08-28 18:12:28 +02:00
Louis Gesbert
1b6da0b572 reformat (renaming in scalc) 2024-08-28 18:10:41 +02:00
Louis Gesbert
1230f787d6 Renaming: use in the scalc translation and in Python 2024-08-28 18:10:36 +02:00
Louis Gesbert
8b06511915 Renaming: more customisation
in particular, this avoids regression with reused struct fields getting renamed
with indices, which would have required changes in e.g.
`french_law/ocaml/bench.ml`
2024-08-28 17:18:26 +02:00
Louis Gesbert
acc13867bf Fix: return module items in file order
This affects their UIDs, and the order in which they later get traversed.

In particular, it was affecting the struct names renaming.
2024-08-28 17:18:26 +02:00
Louis Gesbert
b9156bb60e Implement safe renaming of idents for backend printing
Previously we had some heuristics in the backends trying to achieve this with a
lot of holes ; this should be much more solid, relying on `Bindlib` to do the
correct renamings.

**Note1**: it's not plugged into the backends other than OCaml at the moment.

**Note2**: the related, obsolete heuristics haven't been cleaned out yet

**Note3**: we conservatively suppose a single namespace at the moment. This is
required for e.g. Python, but it forces vars named like struct fields to be
renamed, which is more verbose in e.g. OCaml. The renaming engine could be
improved to support different namespaces, with a way to select how to route the
different kinds of identifiers into them.

Similarly, customisation for what needs to be uppercase or lowercase is not
available yet.

**Note4**: besides excluding keywords, we should also be careful to exclude (or
namespace):
- the idents used in the runtime (e.g. `o_add_int_int`)
- the dynamically generated idents (e.g. `embed_*`)

**Note5**: module names themselves aren't handled yet. The reason is that they
must be discoverable by the user, and even need to match the filenames, etc. In
other words, imagine that `Mod` is a keyword in the target language. You can't
rename a module called `Mod` to `Mod1` without knowing the whole module context,
because that would destroy the mapping for a module already called `Mod1`.

A reliable solution would be to translate all module names to e.g.
`CatalaModule_*`, which we can assume will never conflict with any built-in, and
forbid idents starting with that prefix. We may also want to restrict their
names to ASCII ? Currently we use a projection, but what if I have two modules
called `Là` and `La` ?
2024-08-28 17:18:26 +02:00
Louis Gesbert
f04e889173 Pass the "external module" info along passes 2024-05-28 11:43:50 +02:00
Louis Gesbert
403156b36e Computation and checking of module hashes
This includes a few separate changes:

- pass visibility information of declarations (depending on wether the
  declaration was in a ```catala-metadata block or not)

- add reasonable hash computation functions to discriminate the interfaces. In
  particular:
  * Uids have a `hash` function that depends on their string, but not on their
    actual uid (which is not stable between runs of the compiler) ; the existing
    `hash` function and its uses have been renamed to `id`.
  * The `Hash` module provides the tools to properly combine hashes, etc. While
    we rely on `Hashtbl.hash` for the atoms, we take care not to use it on any
    recursive structure (it relies on a bounded traversal).

- insert the hashes in the artefacts, and properly check and report those (for
  OCaml)

**Remains to do**:

- Record and check the hashes in the other backends

- Provide a way to get stable inline-test outputs in the presence of module
  hashes

- Provide a way to write external modules that don't break at every Catala
  update.
2024-05-28 11:43:50 +02:00
Louis Gesbert
97ae62384e Add externals to scalc, working test with Python backend 2024-02-26 14:56:43 +01:00
Louis Gesbert
e308ff8d02 Generalise the definition of lists of nested binders 2024-02-09 18:33:41 +01:00
Louis Gesbert
3649f92975 Rework resolution of module elements
This changes the `decl_ctx` to be toplevel only, with flattened references to
uids for most elements. The module hierarchy, which is still useful in a few
places, is kept separately.

Module names are also changed to UIDs early on, and support for module aliases
has been added (needs testing).

This resolves some issues with lookup, and should be much more robust, as well
as more convenient for most lookups.

The `decl_ctx` was also extended for string ident lookups, which avoids having
to keep the desugared resolution structure available throughout the compilation
chain.
2023-11-30 21:14:12 +01:00
Louis Gesbert
3b0e576a24 Fix module name propagation 2023-10-13 16:13:02 +02:00
Louis Gesbert
f162f6e9bd Improve handling of module name definitions
and add some sanity-checks for consistency of used modules w.r.t. actually
loaded modules.
2023-09-27 13:14:03 +02:00
Denis Merigoux
9cecf5587a
Register surface syntax languge in program troughout the compilation chain 2023-09-22 18:05:26 +02:00
Louis Gesbert
8e33355ead Reformat 2023-08-31 18:31:48 +02:00
Louis Gesbert
7db63e5f78 Simplification: store paths in Uids
rather than scattered in structures

The context is still hierarchical for defs though, so one needs to retrieve the
path to lookup in the correct context for info. Exceptions are enums and struct
defs, which are re-exposed at toplevel.
2023-08-31 18:31:48 +02:00
Louis Gesbert
72882f82df Reformat 2023-08-31 17:55:36 +02:00
Louis Gesbert
9bac045d03 Implement module lookups for scopes, structs, and enums 2023-08-31 17:54:39 +02:00
Louis Gesbert
e224e87f71 Wip support for modules
(first working dynload test with compilation done by manual calls to ocaml)

A few pieces of the puzzle:

* Loading of interfaces only from Catala files
* Registration of toplevel values in modules compiled to OCaml, to allow access
  using dynlink
* Shady conversion from OCaml runtime values to/from Catala expressions, to
  allow interop (ffi) of compiled modules and the interpreter
2023-06-15 17:56:57 +02:00
Louis Gesbert
209be6b758 Improve integration of marks into the main AST
Two interdependent changes here:
1. Enforce all instances of Shared_ast.gexpr to use the generic type for marks.
   This makes the interfaces a tad simpler to manipulate: you now write
   `('a, 'm) gexpr` rather than `('a, 'm mark) gexpr`.
2. Define a polymorphic `Custom` mark case for use by pass-specific annotations.
   And leverage this in the typing module
2023-05-17 17:37:00 +02:00
Denis Merigoux
f877544368
Remove optimizations for big tests 2023-04-18 15:56:04 +02:00
adelaett
02eeb4ad11
Include Bindlib_ext to Expr.Box 2023-04-14 14:18:28 +02:00
adelaett
0e8eed7ee1 program equality function 2023-04-07 12:10:08 +02:00
adelaett
618ff0518d move printing of program & scope to the Print module 2023-04-07 11:26:10 +02:00
alain
ec40de83fc
Merge branch 'master' into adelaett-withoutexceptionsfix 2023-04-06 13:57:22 +02:00
adelaett
61bdd751e4 corrected program equality 2023-04-04 15:18:08 +02:00
adelaett
685785eaa3 adding assert_closed function 2023-04-03 11:20:19 +02:00
Denis Merigoux
3c364aa1fa
Progress on linting, bugguy unused field detection 2023-03-30 18:52:29 +02:00
adelaett
9806eb7e0f format for program 2023-03-23 13:46:17 +01:00
adelaett
9a34ee95b1 equality program 2023-03-17 17:20:35 +01:00
adelaett
850a1fdb56 more optimization on fold 2023-03-17 11:34:52 +01:00
Louis Gesbert
c3af0b4097 Toplevel definitions: branch cleanup
- fix remaining warnings (mostly unused arguments)
- renamings throughout for consistency and clarity
2023-02-13 18:02:09 +01:00
Louis Gesbert
9b0c7583ec Add top-level definitions
Only handled until before scalc at the moment.
2023-02-13 11:43:49 +01:00
Louis Gesbert
4ae392c900 AST refactoring
Many changes got bundled in here and would be too tedious to separate.

Closes #330

See changes in `shared_ast/definitions.ml` to check the main point.

- the biggest change is a modification of the struct and enum types in
  expressions: they are now stored as `Map`s throughout passes, and no longer
  converted to indexed lists after scopelang. Their accessors are also changed,
  and tuples only exist in Lcalc (they're used for closure conversion).

  This implied adding some more information in the contexts, to keep the mapping
  between struct fields and scope output variables. It should also be much more
  robust (no longer relying on assumptions upon different orderings).

- another very pervasive change is more cosmetic: the rewrite of the main AST to
  use inline records, labelling individual subfields.

- moved the checks for correct definitions and accesses of structures from
  `Scope_to_dcalc` to `Typing`

- defining some new shallow iterators in module `Shared_ast.Expr`, and
  factorising a few same-pass rewriting functions accordingly (closure
  conversion, optimisations, etc.)

- some smaller style improvements (ensuring we use the proper compare/equal
  functions instead of `=` in a few `when` closes, for example)
2022-11-17 18:16:09 +01:00
Louis Gesbert
e10771c187
Make all supertypes use ('a, 't) gexpr as parameter instead of naked_gexpr 2022-08-29 10:57:21 +02:00
Louis Gesbert
493b6703a7
Rename marked_gexpr -> gexpr, gexpr -> naked_gexpr
Since the marked kind is used throughout, this should be more clear
2022-08-29 10:57:21 +02:00
Louis Gesbert
4caf828e48 Additional cleanup/fixes on the compiler refactoring
following review ^^
2022-08-23 00:13:02 +02:00
Louis Gesbert
ae2801be6d Move mode handling code from dcalc to shared_ast
Handling code should now be reasonably well sorted between `Shared_ast.{Var,Expr,Scope,Program}`

The function parameters (e.g. `make_let_in`) could be removed from the
scope handling functions since now the types are compatible, which
makes them much easier to read.
2022-08-22 19:28:27 +02:00
Louis Gesbert
8e7f65d204 Split Shared_ast.Expr of scope and program functions 2022-08-22 19:28:27 +02:00