Previously we had some heuristics in the backends trying to achieve this with a
lot of holes ; this should be much more solid, relying on `Bindlib` to do the
correct renamings.
**Note1**: it's not plugged into the backends other than OCaml at the moment.
**Note2**: the related, obsolete heuristics haven't been cleaned out yet
**Note3**: we conservatively suppose a single namespace at the moment. This is
required for e.g. Python, but it forces vars named like struct fields to be
renamed, which is more verbose in e.g. OCaml. The renaming engine could be
improved to support different namespaces, with a way to select how to route the
different kinds of identifiers into them.
Similarly, customisation for what needs to be uppercase or lowercase is not
available yet.
**Note4**: besides excluding keywords, we should also be careful to exclude (or
namespace):
- the idents used in the runtime (e.g. `o_add_int_int`)
- the dynamically generated idents (e.g. `embed_*`)
**Note5**: module names themselves aren't handled yet. The reason is that they
must be discoverable by the user, and even need to match the filenames, etc. In
other words, imagine that `Mod` is a keyword in the target language. You can't
rename a module called `Mod` to `Mod1` without knowing the whole module context,
because that would destroy the mapping for a module already called `Mod1`.
A reliable solution would be to translate all module names to e.g.
`CatalaModule_*`, which we can assume will never conflict with any built-in, and
forbid idents starting with that prefix. We may also want to restrict their
names to ASCII ? Currently we use a projection, but what if I have two modules
called `Là` and `La` ?
- Clearly distinguish Exceptions from Errors. The only catchable exception
available in our AST is `EmptyError`, so the corresponding nodes are made less
generic, and a node `FatalError` is added
- Runtime errors are defined as a specific type in the OCaml runtime, with a
carrier exception and printing functions. These are used throughout, and
consistently by the interpreter. They always carry a position, that can be
converted to be printed with the fancy compiler location printer, or in a
simpler way from the backends.
- All operators that might be subject to an error take a position as argument,
in order to print an informative message without relying on backtraces from
the backend
As part of making tuples first-class citizens, expliciting the arity upon
function application was needed (so that a function of two args can
transparently -- in the surface language -- be applied to either two arguments
or a pair).
It was decided to actually explicit the whole type of arguments because the cost
is the same, and this is consistent with lambda definitions.
A related change done here is the replacement of the `EOp` node for operators by
an "operator application" `EAppOp` node, enforcing a pervasive invariant that
operators are always directly applied. This makes matches terser, and highlights
the fact that the treatment of operator application is almost always different
from function application in practice.
The way nested priorities are encoded use `< < excs | true :- nested > :- x >`,
which imply that `nested` can actually be ∅ ; to cope with this, the typing of
default terms is made more generic (the return type is now the same as the
`cons` type `'a`, rather than `<'a>`). For the general case, we add an explicit
`EPureDefault` node which just encapsulates its argument (a `return`, in monad
terminology).
rather than scattered in structures
The context is still hierarchical for defs though, so one needs to retrieve the
path to lookup in the correct context for info. Exceptions are enums and struct
defs, which are re-exposed at toplevel.
- Use separate functions for successive passes in module `Driver.Passes`
- Use other functions for end results printing in module `Driver.Commands`
As a consequence, it is much more flexible to use by plugins or libs and we no
longer need the complex polymorphic variant parameter.
This patch leverages previous changes to use Cmdliner subcommands and
effectively specialises the flags of each Catala subcommand.
Other changes include:
- an attempt to normalise the generic options and reduce the number of global
references. Some are ok, like `debug` ; some would better be further cleaned up,
e.g. the ones used by Proof backend were moved to a `Proof.globals` module and
need discussion. The printer no longer relies on the global languages and prints
money amounts in an agnostic way.
- the plugin directory is automatically guessed and loaded even in dev setups.
Plugins are shown by the main `catala` command and listed in `catala --help`
- exception catching at the toplevel has been refactored a bit as well; return
codes are normalised to follow the manpage and avoid codes >= 128 that are
generally reserved for shells.
Update tests
(first working dynload test with compilation done by manual calls to ocaml)
A few pieces of the puzzle:
* Loading of interfaces only from Catala files
* Registration of toplevel values in modules compiled to OCaml, to allow access
using dynlink
* Shady conversion from OCaml runtime values to/from Catala expressions, to
allow interop (ffi) of compiled modules and the interpreter
The module is renamed to `Mark`, and functions renamed to avoid redundancy:
`Marked.mark` is now `Mark.add`
`Marked.unmark` is now `Mark.remove`
`Marked.map_under_mark` is now simply `Mark.map`
etc.
`Marked.same_mark_as` is replaced by `Mark.copy`, but with the arguments
swapped (which seemed more convenient throughout)
Since a type `Mark.t` would indicate a mark, and to avoid confusion, the type
`Marked.t` is renamed to `Mark.ed` as a shorthand for `Mark.marked` ; this part
can easily be removed if that's too much quirkiness.
- Fix the printer for scopes
- Improve the printer for struct types
- Remove `Print.expr'`. Use `Expr.format` as the function with simplified arguments instead.
- `Print.expr` no longer needs the context
- This removes the need for `expr ~debug` + `expr_debug` ;
use `Print.expr` for normal (non-debug) output,
and `Print.expr' ?debug ()` for possibly debug output.
- This improves consistency of debug expr output in many places
- Prints simplified operators (without type suffix) in non-verbose mode
(this patch also fixes some cases of `Expr.skip_wrappers` and leverages the
binder equality provided by Bindlib)
The phantom polymorphic variant qualifying AST nodes is reversed:
- previously, we were explicitely restricting each AST node to the passes where it belonged using a closed type (e.g. `[< dcalc | lcalc]`)
- now, each node instead declares the "feature" it provides using an open type (e.g. `[> 'Exceptions ]`)
- then the AST for a specific pass limits the features it allows with a closed type
The result is that you can mix and match all features if you wish,
even if the result is not a valid AST for any given pass. More
interestingly, it's now easier to write a function that works on
different ASTs at once (it's the inferred default if you don't write a
type restriction).
The opportunity was also taken to simplify the encoding of the
operators, which don't need a second type parameter anymore.