mirror of
https://github.com/CatalaLang/catala.git
synced 2024-09-19 16:28:12 +03:00
Document adding new languages
parent dfb358993c
commit e7ad186bd7
@@ -104,9 +104,10 @@ need more, here is how one can be added:
 for scope parameters, variables or structure fields, since it won't compile
 anymore.
 - Add an element to the `builtin_expression` type in `surface/ast.ml(i)`
-- Add your builtin in the `builtins` list in `surface/lexer.cppo.ml`, and with proper
-translations in all of the language-specific modules `surface/lexer_en.cppo.ml`,
-`surface/lexer_fr.cppo.ml`, etc.
+- Add your builtin in the `builtins` list in `surface/lexer.cppo.ml`, and with
+proper translations in all of the language-specific modules
+`surface/lexer_en.cppo.ml`, `surface/lexer_fr.cppo.ml`, etc. Don't forget the
+macro at the beginning of `lexer.cppo.ml`.
 - The rest can all be done by following the type errors downstream:
 - Add a corresponding element to the lower-level AST in `dcalc/ast.ml(i)`, type `unop`
 - Extend the translation accordingly in `surface/desugaring.ml`
@@ -123,11 +124,40 @@ The Catala language should be adapted to any legislative text that follows a
 general-to-specifics statutes order. Therefore, there exists multiple versions
 of the Catala surface syntax, adapted to the language of the legislative text.
 
-Currently, Catala supports English and French legislative text via the
-`--language=en`, `--language=fr` or `--language=pl` option.
+Currently, Catala supports English, French and Polish legislative text via the
+`--language=en`, `--language=fr` or `--language=pl` options.
 
-Technically, support for new languages can be added via a new lexer. If you want
-to add a new language, you can start from
-[existing lexer examples](compiler/surface/lexer_fr.ml), tweak and open
-a pull request. If you don't feel familiar enough with OCaml to do so, please
-leave an issue on this repository.
+To add support for a new language:
+- the basic syntax localisation is defined in
+`compiler/surface/lexer_xx.cppo.ml` where `xx` is the language code (`en`,
+`fr`...)
+- copy the files from another language, e.g.
+[english](compiler/surface/lexer_en.cppo.ml), then replace the strings with your
+translations. Be careful with the following:
+- The file must be encoded in latin-1
+- For a given token `FOO`, define `MS_FOO` to be the string version of the
+keyword. Due to the encoding, use `\xNN` [escape
+sequences](https://ocaml.org/manual/lex.html#escape-sequence) for utf8
+characters.
+- If the string contains spaces or non-latin1 characters, you need to define
+`MR_FOO` as well with a regular expression in [sedlex
+format](https://github.com/ocaml-community/sedlex#lexer-specifications).
+Replace spaces with `", space_plus, "`, and unicode characters with `",
+0xNNNN, "` where `NNNN` is the hexadecimal unicode codepoint.
+
+**Hint:** You may get syntax errors with unhelpful locations because of
+`sedlex`. In that case the command `ocamlc
+_build/default/compiler/surface/lexer_xx.ml` may point you to the source of the
+error.
+- add your translation to the compilation rules:
+- in `compiler/surface/dune`, copying another `parser_xx.cppo.ml` rule
+- in the `extensions` list in `compiler/driver.ml`
+- add a corresponding variant to `compiler/utils/cli.ml` `backend_lang`, try
+to run `make build` and follow all type errors and `match non exhaustive`
+warnings to be sure it is well handled everywhere.
+- you may want to add syntax highlighting support, see `syntax_highlighting/`
+and the rules in `Makefile`
+- add examples and documentation!
+
+Feel free to open a pull request for discussion even if you couldn't go through
+all these steps, the `lexer_xx.cppo.ml` file is the important part.
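The `MS_FOO` / `MR_FOO` rules added in this commit can be illustrated with a small helper that derives both forms from a translated keyword. This is only a sketch, not part of the Catala tooling: the function names, the token names and the sample keywords are hypothetical, the `#define` layout assumes plain cppo macros, and it treats every non-ASCII character as needing an escape, which is a conservative reading of the text above.

```python
def _quote(chunk: str) -> str:
    """Wrap an ASCII chunk in double quotes, escaping backslashes and quotes."""
    return '"' + chunk.replace("\\", "\\\\").replace('"', '\\"') + '"'


def ms_string(keyword: str) -> str:
    """String form (MS_FOO): non-ASCII characters become \\xNN escapes,
    one per byte of their UTF-8 encoding."""
    pieces = []
    for ch in keyword:
        if ord(ch) < 128:
            pieces.append(ch.replace("\\", "\\\\").replace('"', '\\"'))
        else:
            pieces.extend("\\x%02x" % byte for byte in ch.encode("utf-8"))
    return '"' + "".join(pieces) + '"'


def needs_regex(keyword: str) -> bool:
    """Assumption: a regex form is needed for spaces or any non-ASCII character."""
    return any(ch == " " or ord(ch) >= 128 for ch in keyword)


def mr_regex(keyword: str) -> str:
    """Regex form (MR_FOO): spaces become space_plus, non-ASCII characters
    become 0xNNNN codepoints, ASCII runs stay quoted strings."""
    parts = []
    chunk = ""
    for ch in keyword:
        if ch != " " and ord(ch) < 128:
            chunk += ch
            continue
        if chunk:
            parts.append(_quote(chunk))
            chunk = ""
        if ch == " ":
            # Collapse runs of spaces into a single space_plus.
            if not parts or parts[-1] != "space_plus":
                parts.append("space_plus")
        else:
            parts.append("0x%04X" % ord(ch))
    if chunk:
        parts.append(_quote(chunk))
    return ", ".join(parts)


def define_lines(token: str, keyword: str) -> list:
    """Build the macro lines for one token (hypothetical layout)."""
    lines = ["#define MS_%s %s" % (token, ms_string(keyword))]
    if needs_regex(keyword):
        lines.append("#define MR_%s %s" % (token, mr_regex(keyword)))
    return lines


if __name__ == "__main__":
    samples = [("SCOPE", "champ d'application"),
               ("CONSEQUENCE", "conséquence"),
               ("RULE", "rule")]
    for token, keyword in samples:
        print("\n".join(define_lines(token, keyword)))
```

The generated lines are pure ASCII, so they can be pasted into a latin-1 encoded `lexer_xx.cppo.ml` without any re-encoding step; they should still be checked against an existing language file before use.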