This includes a few separate changes:
- pass visibility information of declarations (depending on wether the
declaration was in a ```catala-metadata block or not)
- add reasonable hash computation functions to discriminate the interfaces. In
particular:
* Uids have a `hash` function that depends on their string, but not on their
actual uid (which is not stable between runs of the compiler) ; the existing
`hash` function and its uses have been renamed to `id`.
* The `Hash` module provides the tools to properly combine hashes, etc. While
we rely on `Hashtbl.hash` for the atoms, we take care not to use it on any
recursive structure (it relies on a bounded traversal).
- insert the hashes in the artefacts, and properly check and report those (for
OCaml)
**Remains to do**:
- Record and check the hashes in the other backends
- Provide a way to get stable inline-test outputs in the presence of module
hashes
- Provide a way to write external modules that don't break at every Catala
update.
Positions within the Default terms are specially important since they can come
from separate definitions in the source (before this, we would be falling back
to the single declaration).
- Clearly distinguish Exceptions from Errors. The only catchable exception
available in our AST is `EmptyError`, so the corresponding nodes are made less
generic, and a node `FatalError` is added
- Runtime errors are defined as a specific type in the OCaml runtime, with a
carrier exception and printing functions. These are used throughout, and
consistently by the interpreter. They always carry a position, that can be
converted to be printed with the fancy compiler location printer, or in a
simpler way from the backends.
- All operators that might be subject to an error take a position as argument,
in order to print an informative message without relying on backtraces from
the backend
Closes#592
A new node is added in `desugared`, and translated into an exploded structure
literal during translation to `scopelang`. The main reason to put it there is
that it needs to be after disambiguation, since that is used to discover the
type of the structure that is being updated.
Ensuring messages don't print overlong lines still requires some manual work:
- if they don't contain any `Format` directives (`%` or `@`), use `"%a"
Format.pp_print_text` to turn word-wrapping on.
- otherwise replace spaces with `@ ` to mark possible cutting points, as soon
that it's possible the line will get over 80 chars (most often, this means
starting before the first `%a`)
Thanks @denismerigoux
This renames the "ScopeDef" variant from `SubScope` to `SubScopeInput`, which is
much clearer and avoids confusion with the `SubScope` elements in the surface
AST (which are really subscopes and not variables at this point).
And improves some error message by specialising depending on whether we are
dealing with a subscope or an explicit structure.
Lots of tests have a new warning because they were calling subscopes without
using their outputs. A better solution could be to mark these subscopes as
`output`, now that it's possible !
They are to become citizens of the same class if we want to allow
output-subscopes (without unnecessary complications like deconstructing and
reconstructing the same structure). And it's reasonable to assume that they
share the same namespace.
With this we should shortly collapse the (internal) ambiguity between
- `subscope.subvar`: access to a variable within a subscope
- `subscope.subfield`: access to a field of the output structure contained in a
subscope variable
With the subscope a variable, these should now become strictly equivalent, so
the plan is that the first could be removed.
This allows for retyping after monomorphisation: a new function just extracts
the return type of the operator, without checking the operand types.
Also to avoid multiplying function arguments around the typer, the flags have
been gathered in a record that is included in the typing environment; it's ok to
give them default values as long as these are the strictest.
As part of making tuples first-class citizens, expliciting the arity upon
function application was needed (so that a function of two args can
transparently -- in the surface language -- be applied to either two arguments
or a pair).
It was decided to actually explicit the whole type of arguments because the cost
is the same, and this is consistent with lambda definitions.
A related change done here is the replacement of the `EOp` node for operators by
an "operator application" `EAppOp` node, enforcing a pervasive invariant that
operators are always directly applied. This makes matches terser, and highlights
the fact that the treatment of operator application is almost always different
from function application in practice.
This changes the `decl_ctx` to be toplevel only, with flattened references to
uids for most elements. The module hierarchy, which is still useful in a few
places, is kept separately.
Module names are also changed to UIDs early on, and support for module aliases
has been added (needs testing).
This resolves some issues with lookup, and should be much more robust, as well
as more convenient for most lookups.
The `decl_ctx` was also extended for string ident lookups, which avoids having
to keep the desugared resolution structure available throughout the compilation
chain.
The way nested priorities are encoded use `< < excs | true :- nested > :- x >`,
which imply that `nested` can actually be ∅ ; to cope with this, the typing of
default terms is made more generic (the return type is now the same as the
`cons` type `'a`, rather than `<'a>`). For the general case, we add an explicit
`EPureDefault` node which just encapsulates its argument (a `return`, in monad
terminology).
rather than scattered in structures
The context is still hierarchical for defs though, so one needs to retrieve the
path to lookup in the correct context for info. Exceptions are enums and struct
defs, which are re-exposed at toplevel.
... and add a custom printer
Since this is a very common bug, this patch should gain us a lot of time when
debugging uncaught Not_found errors, because the element not found can now be
printed straight away without the need for further debugging.
The small cost is that one should remember to catch the correct specialised
`Foo.Map.Not_found _` exception rather than the standard `Not_found` (which
would type-check but not catch the exception). Using `find_opt` should be
preferred anyway.
Note that the other functions from the module `Map` that raise `Not_found` are
not affected ; these functions are `choose`, `min/max_binding`,
`find_first/last` which either take a predicate or fail on the empty map, so it
wouldn't make sense for them (and we probably don't use them much).
- Use separate functions for successive passes in module `Driver.Passes`
- Use other functions for end results printing in module `Driver.Commands`
As a consequence, it is much more flexible to use by plugins or libs and we no
longer need the complex polymorphic variant parameter.
This patch leverages previous changes to use Cmdliner subcommands and
effectively specialises the flags of each Catala subcommand.
Other changes include:
- an attempt to normalise the generic options and reduce the number of global
references. Some are ok, like `debug` ; some would better be further cleaned up,
e.g. the ones used by Proof backend were moved to a `Proof.globals` module and
need discussion. The printer no longer relies on the global languages and prints
money amounts in an agnostic way.
- the plugin directory is automatically guessed and loaded even in dev setups.
Plugins are shown by the main `catala` command and listed in `catala --help`
- exception catching at the toplevel has been refactored a bit as well; return
codes are normalised to follow the manpage and avoid codes >= 128 that are
generally reserved for shells.
Update tests