mirror of
https://github.com/thufschmitt/tix-papers.git
synced 2024-09-17 15:57:11 +03:00
167 lines
6.4 KiB
TeX
167 lines
6.4 KiB
TeX
\input{common/header}
|
||
|
||
\title{Typing of records in nix}
|
||
|
||
\begin{document}
|
||
|
||
\maketitle{}
|
||
|
||
We assume a record language, whose grammar is defined as follows:
|
||
|
||
\begin{grammar}
|
||
<e> ::= \{ \e/ = \e/ ; \} \| \{\}
|
||
\| \e/ + \e/
|
||
\| \e/ $\orthplus{}$ \e/
|
||
\| \e/ \textbackslash{} \e/
|
||
\| \e/.\e/
|
||
\| s
|
||
\end{grammar}
|
||
|
||
Where:
|
||
\begin{itemize}
|
||
\item $+$ denotes the merge of two records (with precedence on the right if
|
||
the same field exists in both records),
|
||
\item $\orthplus{}$ denotes the merge of two records, assuming no field is
|
||
defined in both,
|
||
\item \textbackslash{} denotes the removal of a field (\emph{ie} $e_1
|
||
\backslash e_2$ is the record $e_1$ whose field $s$ is undefined, where $s$
|
||
is the value of $e_2$).
|
||
\item $e_1.e_2$ denots the access to the field $s$ of the record $e_2$,
|
||
where $s$ is the value of $e_2$.
|
||
\item $s$ range over the set of allowed value labels (which here will be \ty{string})
|
||
\end{itemize}
|
||
|
||
\section{Static labels}
|
||
\subsection{Typing in a CBV setting}
|
||
The typing of records in a CBV language is already given (in absence of
|
||
polymorphism) by the formalism given in Section 4.5 of~\cite{Cas15} (which itself
|
||
is the reformulation of the formalism presented by Alain Frisch in its Phd
|
||
Thesis − see~\cite{Fri04}).
|
||
|
||
In the original formalism, records (resp.\ record types) are interpreted as
|
||
\emph{quasi-constant} functions from \ty{string} to $\textbf{Values} \cup
|
||
\undef$ (resp.\ from \ty{string} to $\textbf{Types} \cup \undef$), where
|
||
\begin{itemize}
|
||
\item A \emph{quasi-constant} function from \set{L} to \set{Z} is a
|
||
function $r: \set{L} \rightarrow \set{Z}$ which is constant to an element
|
||
$z \in \set{Z}$, except for a finite set $\dom(r)$. We note $\set{L}
|
||
\quasiconst{z} \set{Z}$ for the set of functions from \set{L} to \set{Z}
|
||
quasi-constant to $z$.
|
||
\item \textbf{Values} denotes the set of values in the language.
|
||
\item \textbf{Types} denotes the set of values in the language.
|
||
\item $\undef$ is a distinguished constant that represents an undefined
|
||
field.
|
||
\end{itemize}
|
||
|
||
(in this formalism, the constant $\undef$ was called $\bot$, we renamed it here
|
||
in order to avoid confusion with the type $\bot$ representing an undefined
|
||
computation).
|
||
|
||
\subsection{Adaptation to a call-by-name semantic}
|
||
|
||
As long as we do not allow anything but constant strings as labels, this
|
||
formalism requires only a few modifications in order to be useful in a CBN
|
||
setting. In fact it is sufficient (I think) to interpret a record as an element
|
||
of $\ty{string} \quasiconst{\undef} \left(\textbf{Expressions} \cup
|
||
\undef\right)$ (instead of $\ty{string} \quasiconst{\undef}
|
||
\left(\textbf{Values} \cup \undef\right)$).
|
||
|
||
\section{Dynamic labels}
|
||
|
||
In nix, the labels can be not only static strings, but also arbitrary
|
||
expressions (whose value cannott by consequence be statically known in the
|
||
general case), which makes the typing more complicated and way less accurate.
|
||
|
||
One problem raised by this is the semantic of labels evaluation (ie: when do we
|
||
evaluate the labels?). When the labels were static, this was not a problem at
|
||
all as evaluation of labels was a no-op
|
||
|
||
The most logical choice seems to evaluate every label eagerly as soon as we
|
||
need the value of one of them. This entails that record values will have the
|
||
form \texttt{\bfseries \{ $s$ = $e$; \ldots{}; $s$ = $e$; \}} where $s$ designs
|
||
an evaluated label (a string value). The semantic is then the one given in the
|
||
related \texttt{semantics} paper.
|
||
|
||
The consequence is that, for example, an ill-formed record (with a field
|
||
defined twice) will not be detected until it is used (one try to access or test
|
||
the existence of one of the fields), while with static fields the bad formation
|
||
of the record can be detected at the time of the definition, even in a lazy
|
||
setting.
|
||
|
||
Nix mixes both approaches by detecting a collision between static fields at
|
||
definition place and between dynamic fields (or between a static and a dynamic
|
||
one) at use-time. This approach is rather inconsistent (even more if we decide
|
||
to consider a static label $s$ as syntactic sugar for the dynamic label whose
|
||
expression is the constant string ``$s$''), so probably will not be reflected in
|
||
the type system\footnote{It could nonethless be implemented as a pre-treatment
|
||
of the AST in order to match nix's behaviour}.
|
||
|
||
\section{Orthogonal merge}
|
||
|
||
In CDuce, as in perl6 (the contexts into which respectivly~\cite{Fri04}
|
||
and~\cite{Cas15} have been written), a record can be defined with the same field
|
||
appearing twice, in which case the second appearance takes precedence over the
|
||
first. This led to the definition of the $\oplus_t$ (\emph{merge} with
|
||
respect to the value $t$) operator between two records.
|
||
|
||
In nix, such a definition would be invalid − the language requires all
|
||
fields to be distinct − so we cannot use this $\oplus_t$ operator to
|
||
define records and their typing. That's why we introduced the $\orthplus$
|
||
operator, so that the litteral expression
|
||
|
||
\begin{lstlisting}
|
||
{
|
||
x1 = e1;
|
||
...;
|
||
xn = en;
|
||
}
|
||
\end{lstlisting}
|
||
|
||
is syntactic sugar for
|
||
|
||
\begin{lstlisting}
|
||
{ x1 = e1; } //*$\orthplus$*// ... //*$\orthplus$*// { xn = en; }
|
||
\end{lstlisting}
|
||
|
||
We also define a $\orthsum_t$ (orthogonal merge) operator between record type
|
||
atoms defined as:
|
||
|
||
\[
|
||
T_1 \orthsum_t T_2 = T_1 \oplus_t T_2
|
||
\text{ if } \dom_t(T_1) \wedge \dom_t(T_2) = \varnothing
|
||
\]
|
||
|
||
where for a record type atom $T$, $\dom_t(T)$ is defined as the set
|
||
of all elements of \ty{string} whose image by $T$ is different from $t$.
|
||
|
||
We note $T_1 \orthplus T_2$ for $T_1 \orthsum_\undef T_2$
|
||
|
||
\section{Typing}
|
||
\subsection{Subtyping relation}
|
||
|
||
We first extend the $\subtype$ relation to a relation on $\ty{string}
|
||
\rightarrow \textbf{Types}$ (defined as the pointwise relation: $\forall r_1,
|
||
r_2 \in \ty{string} \rightarrow \textbf{Types}, r_1 \subtype r_2
|
||
\Leftrightarrow \left( \forall s \in \ty{string}, r_1(s) \subtype r_2(s)
|
||
\right)$).
|
||
|
||
This relation defines the subtyping for record types. For example, for two
|
||
closed record types:
|
||
|
||
if $r_1 = \{ l_1 = v_1; \ldots{}; l_n = v_n; \}$
|
||
and $r_2 = \{ k_1 = w_1; \ldots{}; k_m = w_m; \}$
|
||
then
|
||
$r_1 \subtype r_2$ if and only if $n = m$ and there exists a permutation $\σ$
|
||
of $\left\{1, \ldots{}, n\right\}$ such that for all $i \in \left\{1, \ldots{},
|
||
n\right\}$, $l_i = k_{\σ(i)}$ and $v_i \subtype w_{\σ(i)}$.
|
||
|
||
\subsection{Typing rules}
|
||
\subsubsection{Dynamic labels}
|
||
\begin{mathpar}
|
||
\input{typing/recordTypingRules}
|
||
\end{mathpar}
|
||
|
||
\bibliographystyle{alpha}
|
||
\bibliography{../references}
|
||
\end{document}
|