mirror of
https://github.com/CatalaLang/catala.git
synced 2024-11-14 03:14:55 +03:00
391 lines
15 KiB
Plaintext
391 lines
15 KiB
Plaintext
@@The Catala language tutorial@@
|
|
|
|
Welcome to this tutorial, whose objective is to guide you through the features
|
|
of the Catala language and teach you how to annotate a legislative text using
|
|
the language. This document is addressed primarily to developers or people that
|
|
have a programming background, though tech-savvy lawyers will probably figure
|
|
things out.
|
|
|
|
@@Literate programming@@+
|
|
|
|
To begin writing a Catala program, you must start from the text of the
|
|
legislative source that will justify the code that you will write. Concretely,
|
|
that means copy-pasting the text of the law into a Catala source file and
|
|
formatting it according so that Catala can understand it. Catala source files
|
|
have the ".catala_en" extension. If you were to write a Catala program for a
|
|
French law, you would use the ".catala_fr" extension.
|
|
|
|
You can write any kind of plain text in Catala, and it will be printed as is
|
|
in PDF or HTML output. You can split your text into short lines, those
|
|
will appear as a single paragraph in the output. If you want to create a
|
|
new paragrah, you have to leave a blank line in the source.
|
|
|
|
Catala allows you to declare section or subsection headers as it is done
|
|
here, with the "at" symbol repeated twice. You can define heading of lower
|
|
importance by adding increasing numbers of "+" after the title of the heading.
|
|
|
|
The fundamental division unit is the article, introduced by a single "at".
|
|
Let's analyse a fictional example that defines an income tax.
|
|
|
|
@Article 1@
|
|
The income tax for an individual is defined as a fixed percentage of the
|
|
individual's income over a year.
|
|
|
|
/*
|
|
# Welcome to the code mode of Catala. This is a comment, because the line is
|
|
# prefixed by #.
|
|
# We will soon learn what to write here in order to translate the meaning
|
|
# of the article into Catala code.
|
|
*/
|
|
|
|
To do that, we will intertwine short snippets of code between the sentences of
|
|
the legislative text. Each snippet of code should be as short as possible and
|
|
as close as possible to the actual sentence that justifies the code. This style
|
|
is called literate programming, a programming paradigm invented by the famous
|
|
computer scientist Donald Knuth in the 70s.
|
|
|
|
@@Defining a fictional income tax@@+
|
|
|
|
The content of article 1 uses a lot of implicit context: there exists an
|
|
individual with an income, as well as an income tax that the individual has
|
|
to pay each year. Even if this implicit context is not verbatim in the law,
|
|
we have to explicit it for programming purposes. Concretely, we need a
|
|
"metadata" section that defines the shape and types of the data used
|
|
inside the law.
|
|
|
|
Let's start our metadata section by declaring the type information for the
|
|
individual:
|
|
|
|
@@Begin metadata@@
|
|
/*
|
|
declaration structure Individual:
|
|
# The name of the structure "Individual", must start with an
|
|
# uppercase letter: this is the CamelCase convention.
|
|
data income content money
|
|
# In this line, "income" is the name of the structure field and
|
|
# "money" is the type of what is stored in that field.
|
|
# Available types include: integer, decimal, money, date, duration,
|
|
# and any other structure or enumeration that you declare
|
|
data number_of_children content integer
|
|
# "income" and "number_of_children" start by a lowercase letter,
|
|
# they follow the snake_case convention
|
|
*/
|
|
@@End metadata@@
|
|
|
|
This structure contains two data fields, "income" and "number_of_children". Structures are
|
|
useful to group together data that goes together. Usually, you
|
|
get one structure per concrete object on which the law applies (like the
|
|
individual). It is up to you to decide how to group the data together,
|
|
but you should aim to optimize code readability.
|
|
|
|
Sometimes, the law gives an enumeration of different situations. These
|
|
enumerations are modeled in Catala using an enumeration type, like:
|
|
@@Begin metadata@@
|
|
/*
|
|
declaration enumeration TaxCredit:
|
|
# The name "TaxCredit" is also written in CamelCase
|
|
-- NoTaxCredit
|
|
# This line says that "TaxCredit" can be a "NoTaxCredit" situation
|
|
-- ChildrenTaxCredit content integer
|
|
# This line says that alternatively, "TaxCredit" can be a
|
|
# "ChildrenTaxCredit" situation. This situation carries a content
|
|
# of type integer corresponding to the number of children concerned
|
|
# by the tax credit. This means that if you're in the "ChildrenTaxCredit"
|
|
# situation, you will also have access to this number of children
|
|
*/
|
|
@@End metadata@@
|
|
|
|
In computer science terms, such an enumeration is called a "sum type" or simply
|
|
an enum. The combination of structures and enumerations allow the Catala
|
|
programmer to declare all possible shapes of data, as they are equivalent to
|
|
the powerful notion of "algebraic datatypes".
|
|
|
|
We've defined and typed the data that the program will manipulate. Now we have
|
|
to define the logical context in which these data will evolve. This is done in
|
|
Catala using "scopes". Scopes are close to functions in terms of traditional
|
|
programming. Scopes also have to be declared in metadata, so here we go:
|
|
|
|
@@Begin metadata@@
|
|
/*
|
|
declaration scope IncomeTaxComputation:
|
|
# Scope names use CamelCase
|
|
context individual content Individual
|
|
# This line declares a context element of the scope, which is akin to
|
|
# a function parameter in computer science term. This is the piece of
|
|
# data on which the scope will operate
|
|
context fixed_percentage content decimal
|
|
context income_tax content money
|
|
*/
|
|
@@End metadata@@
|
|
|
|
We now have everything to annotate the contents of article 1, which is copied
|
|
over below.
|
|
|
|
@Article 1@
|
|
The income tax for an individual is defined as a fixed percentage of the
|
|
individual's income over a year.
|
|
/*
|
|
scope IncomeTaxComputation:
|
|
definition income_tax equals
|
|
individual.income *$ fixed_percentage
|
|
*/
|
|
|
|
In the code, we are defining inside our scope the amount of the income tax
|
|
according to the formula described in the article. When defining formulae,
|
|
you have access to all the usual arithmetic operators: addition "+",
|
|
substraction "-", multiplication "*" and division (slash).
|
|
|
|
However, in the Catala code, you can see that we use "*$" to multiply the
|
|
individual income by the fixed percentage. The $ suffix indicates that we
|
|
are performing a multiplication on an amount of money. Indeed, in Catala,
|
|
you have to keep track of what you are dealing with: is it money ? Is it
|
|
an integer? Using just "+" or "*" can be ambiguous in terms of rounding,
|
|
since money is usually rounded at the cent. So to disambiguate, we suffix these
|
|
operations with something that indicates the type of what we manipulate.
|
|
The suffixes are "$" for money "." for decimals, "at" (like in email adresses)
|
|
for dates and the hat symbol for durations. If you forget the suffix, the Catala type
|
|
checker will display an error message that will help you put it where it
|
|
belongs.
|
|
|
|
But inside article 1, one question remains unknown: what is the value of
|
|
of the fixed percentage? Often, precise values are defined elsewhere in the
|
|
legislative source. Here, let's suppose we have:
|
|
|
|
@Article 2@
|
|
The fixed percentage mentionned at article 1 is equal to 20 %.
|
|
/*
|
|
scope IncomeTaxComputation:
|
|
definition fixed_percentage equals 20 %
|
|
# Writing 20% is just an abbreviation for 0.20
|
|
*/
|
|
|
|
You can see here that Catala allows definitions to be scattered throughout
|
|
the annotation of the legislative text, so that each
|
|
definition is as close as possible to its location in the text.
|
|
|
|
@@Conditional definitions@@+
|
|
|
|
So far so good, but now the legislative text introduces some trickiness. Let us
|
|
suppose the third article says:
|
|
|
|
@Article 3@ If the individual is in charge of 2 or more children, then the fixed
|
|
percentage mentionned at article 1 is equal to 15 %.
|
|
/*
|
|
# How to redefine fixed_percentage?
|
|
*/
|
|
|
|
This article actually gives another definition for the fixed percentage, which
|
|
was already defined in article 2. However, article 3 defines the percentage
|
|
conditionnally to the individual having more than 2 children. Catala allows
|
|
you precisely to redefine a variable under a condition:
|
|
|
|
/*
|
|
scope IncomeTaxComputation:
|
|
definition fixed_percentage under condition
|
|
individual.number_of_children >= 2
|
|
consequence equals 15 %
|
|
# Writing 15% is just an abbreviation for 0.15
|
|
*/
|
|
|
|
When the Catala program will execute, the right definition will be dynamically
|
|
chosen by looking at which condition is true. A correctly drafted legislative
|
|
source should always ensure that at most one condition is true at all times.
|
|
However, if it is not the case, Catala will let you define a precedence on the
|
|
conditions, which has to be justified by the law.
|
|
|
|
|
|
@@Functions@@+
|
|
|
|
Catala lets you define functions anywhere in your data. Here's what it looks
|
|
like in the metadata definition when we want to define a two-brackets tax
|
|
computation:
|
|
@@Begin metadata@@
|
|
/*
|
|
declaration structure TwoBrackets:
|
|
data breakpoint content money
|
|
data rate1 content decimal
|
|
data rate2 content decimal
|
|
|
|
declaration scope TwoBracketsTaxComputation :
|
|
context brackets content TwoBrackets
|
|
context tax_formula content money depends on money
|
|
*/
|
|
@@End metadata@@
|
|
|
|
And in the code:
|
|
|
|
@Article4@ The tax amount for a two-brackets computation is equal to the amount
|
|
of income in each bracket multiplied by the rate of each bracket.
|
|
|
|
/*
|
|
scope TwoBracketsTaxComputation :
|
|
definition tax_formula of income equals
|
|
if income <=$ brackets.breakpoint then
|
|
income *$ brackets.rate1
|
|
else (
|
|
brackets.breakpoint *$ brackets.rate1 +$
|
|
(income -$ brackets.breakpoint) *$ brackets.rate2
|
|
)
|
|
*/
|
|
|
|
@@Scope inclusion@@+
|
|
|
|
Now that we've defined our helper scope for computing a two-brackets tax, we
|
|
want to use it in our main tax computation scope.
|
|
|
|
@Article 5@ For individuals whose income is greater than $100,000, the income
|
|
tax of article 1 is 40% of the income above $100,000. Below $100,000, the
|
|
income tax is 20% of the income.
|
|
/*
|
|
declaration scope NewIncomeTaxComputation:
|
|
context two_brackets scope TwoBracketsTaxComputation
|
|
# This line says that we add the item two_brackets to the context.
|
|
# However, the "scope" keyword tells that this item is not a piece of data
|
|
# but rather a subscope that we can use to compute things.
|
|
context individual content Individual
|
|
context income_tax content money
|
|
|
|
scope NewIncomeTaxComputation :
|
|
definition two_brackets.brackets equals TwoBrackets {
|
|
-- breakpoint: $100,000
|
|
-- rate1: 20%
|
|
-- rate2: 40%
|
|
}
|
|
definition income_tax equals two_brackets.tax_formula of individual.income
|
|
*/
|
|
|
|
@Article 6@
|
|
Individuals earning less than $10,000 are exempted of the income tax mentionned
|
|
at article 1.
|
|
/*
|
|
scope NewIncomeTaxComputation:
|
|
definition income_tax under condition
|
|
individual.income <=$ $10,000
|
|
consequence equals $0
|
|
*/
|
|
|
|
That's it! We've defined a two-brackets tax computation simply by annotating
|
|
legislative article by snippets of Catala code. However, attentive readers
|
|
may have caught something weird in articles 5 and 6. What happens when the
|
|
income of the individual is between $10,000 and $100,000 ?
|
|
|
|
The law leaves it unspecified ; our dummy articles are clearly badly drafted.
|
|
But Catala can help you find this sort of errors via simple testing or
|
|
even formal verification. Let's start with the testing.
|
|
|
|
@@Testing Catala programs@@+
|
|
|
|
Testing Catala programs can be done directly into Catala. Indeed, writing test
|
|
cases for each Catala scope that you define is a good practice called
|
|
"unit testing" in the software engineering community. A test case is defined
|
|
as another scope:
|
|
|
|
@Testing NewIncomeTaxComputation@
|
|
/*
|
|
declaration scope Test1:
|
|
context tax_computation scope NewIncomeTaxComputation
|
|
context income_tax content money
|
|
|
|
scope Test1:
|
|
definition
|
|
tax_computation.individual
|
|
# We define the argument to the subscope
|
|
equals
|
|
# The four lines below define a whole structure by giving a value to
|
|
# each of its fields
|
|
Individual {
|
|
-- income: $230,000
|
|
-- number_of_children: 0
|
|
}
|
|
|
|
definition income_tax equals tax_computation.income_tax
|
|
# Next, we retrieve the income tax value compute it by the subscope and
|
|
# assert that it is equal to the expected value :
|
|
# ($230,000-$100,00)*40%+$100,000*20% = $72,000
|
|
assertion income_tax = $72,000
|
|
*/
|
|
|
|
This test should pass. Let us now consider a failing test case:
|
|
/*
|
|
declaration scope Test2:
|
|
context tax_computation scope NewIncomeTaxComputation
|
|
context income_tax content money
|
|
|
|
scope Test2:
|
|
definition tax_computation.individual equals Individual {
|
|
-- income: $4,000
|
|
-- number_of_children: 0
|
|
}
|
|
|
|
definition income_tax equals tax_computation.income_tax
|
|
assertion income_tax = $0
|
|
*/
|
|
|
|
This test case should compute a $0 income tax because of Article 6. But instead,
|
|
execution will yield an error saying that there is a conflict between rules.
|
|
|
|
@@Defining exceptions to rules@@+
|
|
|
|
Indeed, the definition of the income tax in article 6 conflicts with the
|
|
definition of income tax in article 5. But actually, article 6 is just an
|
|
exception of article 5. In the law, it is implicit that if article 6 is
|
|
applicable, then it takes precedence over article 5.
|
|
|
|
@Fixing the computation@
|
|
|
|
This implicit precedence has to be explicitely declared in Catala. Here is a
|
|
fixed version of the NewIncomeTaxComputation scope:
|
|
|
|
/*
|
|
declaration scope NewIncomeTaxComputationFixed:
|
|
context two_brackets scope TwoBracketsTaxComputation
|
|
context individual content Individual
|
|
context income_tax content money
|
|
|
|
scope NewIncomeTaxComputationFixed :
|
|
definition two_brackets.brackets equals TwoBrackets {
|
|
-- breakpoint: $100,000
|
|
-- rate1: 20%
|
|
-- rate2: 40%
|
|
}
|
|
|
|
# To define an exception to a rule, you have to first label the rule that
|
|
# you want to attach the exception to. You can put any snake_case identifier
|
|
# for the label
|
|
label article_5
|
|
definition income_tax equals two_brackets.tax_formula of individual.income
|
|
|
|
# Then, you can declare the exception by referring back to the label
|
|
exception article_5
|
|
definition income_tax under condition
|
|
individual.income <=$ $10,000
|
|
consequence equals $0
|
|
*/
|
|
|
|
And the test that should now work:
|
|
|
|
/*
|
|
declaration scope Test3:
|
|
context tax_computation scope NewIncomeTaxComputationFixed
|
|
context income_tax content money
|
|
|
|
scope Test3:
|
|
definition tax_computation.individual equals Individual {
|
|
-- income: $4,000
|
|
-- number_of_children: 0
|
|
}
|
|
definition income_tax equals tax_computation.income_tax
|
|
assertion income_tax = $0
|
|
*/
|
|
|
|
@@Conclusion@@+
|
|
|
|
This tutorial presents the basic concepts and syntax of the Catala language
|
|
features. It is then up to you use them to annotate legislative texts
|
|
with their algorithmic translation.
|
|
|
|
There is no single way to write Catala programs, as the program style should be
|
|
adapted to the legislation it annotates. However, Catala is a functional
|
|
language at heart, so following standard functional programming design patterns
|
|
should help achieve concise and readable code.
|