From 0a21d34ab70cd21eb5af96b189708b6c1fb5ac96 Mon Sep 17 00:00:00 2001 From: Cheng Shao Date: Thu, 20 Jan 2022 13:20:31 +0000 Subject: [PATCH] WIP --- docs/default.nix | 8 +- docs/src/semantics.md | 114 +------------- docs/src/semantics.tex | 347 +++++++++++++++++++++++++++++++++++++++++ 3 files changed, 355 insertions(+), 114 deletions(-) create mode 100644 docs/src/semantics.tex diff --git a/docs/default.nix b/docs/default.nix index e346e014..a7aacc00 100644 --- a/docs/default.nix +++ b/docs/default.nix @@ -1,4 +1,4 @@ -{ haskell-nix, mdbook, stdenvNoCC }: +{ haskell-nix, mdbook, stdenvNoCC, stix-two, texlive }: stdenvNoCC.mkDerivation { name = "asterius-docs"; src = haskell-nix.haskellLib.cleanGit { @@ -6,8 +6,12 @@ stdenvNoCC.mkDerivation { src = ../.; subDir = "docs"; }; - nativeBuildInputs = [ mdbook ]; + nativeBuildInputs = [ mdbook stix-two texlive.combined.scheme-full ]; buildPhase = '' + pushd $(mktemp -d) + latexmk -pdfxe -file-line-error -halt-on-error $OLDPWD/src/semantics.tex + mv semantics.pdf $OLDPWD/src + popd mdbook build --dest-dir $out ''; dontInstall = true; diff --git a/docs/src/semantics.md b/docs/src/semantics.md index cc516db4..0229a2e6 100644 --- a/docs/src/semantics.md +++ b/docs/src/semantics.md @@ -1,113 +1,3 @@ -# Draft semantics of concurrency and foreign calls. +# Interaction of GHC Runtime with JavaScript Runtime -Note: This document assumes that every function takes exactly one -argument. Just imagine that it's the last argument in a fully -saturated call. - -## Foreign export asynchronous - -Suppose that a Haskell function `f` is exported to JavaScript -asynchronously (which might be the default). When JavaScript calls the -exported function with argument `v`, it has the effect of performing -the IO action `⟦f⟧ v`, where the translation `⟦f⟧` is defined as -follows: - - ⟦f⟧ v = do - p <- allocate new promise - let run_f = case try (return $ f $ jsToHaskell v) of - Left exn -> p.fails (exnToJS exn) - Right a -> p.succeeds (haskellToJS a) - forkIO run_f - return p -- returned to JavaScript - -Not specified here is whether the scheduler is allowed to steal a few -cycles to run previously forked threads. - -N.B. This is just a semantics. We certainly have the option of -implementing the entire action completely in the runtime system. - -Not yet specified: What is the API by which JavaScript would call an -asynchronously exported Haskell function? Would it, for example, use -API functions to construct a Haskell closure, then evaluate it? - -## Foreign import asynchronous - -Suppose that a JavaScript function `g` is imported asynchronously -(which might be the default). Let types `a` and `b` stand for two -unknown but fixed types. The JavaScript function expects an argument -of type `a` and returns a `Promise` that (if successful) eventually -delivers a value of type `b`. When a Haskell thunk of the form `g e` -is forced (evaluated), the machine performs the following monadic -action, the result of which is (eventually) written into the thunk. - - do let v = haskellToJS e -- evaluates e, converts result to JavaScript - p <- g v -- call returns a `Promise`, "immediately" - m <- newEmptyMVar - ... juju to associate m with p ... -- RTS primitive? - result <- takeMVar m - case result of Left fails -> ... raise asynchronous exception ... - Right b -> return $ jsToHaskell v - -## CPU sharing - -Suppose GHC wishes to say politely to the JavaScript engine, "every so -often I would like to use the CPU for a bounded time." It looks like -Haskell would need to add a message to the JavaScript message queue, -such that the function associated with that messages is "run Haskell -for N ticks." Is the right API to call `setTimeout` with a delay of 0 -seconds? - -## Concurrency sketch - -Let's suppose the state of a Haskell machine has these components: - - - `F` ("fuel") is the number of ticks a Haskell thread can execute - before returning control to JavaScript. This component is present - only when Haskell code is running. - - - `R` ("running") is either the currently running Haskell thread, or - if no thread is currently running, it is • ("nothing") - - - `Q` ("run queue") is a collection of runnable threads. - - - `H` ("heap") is the Haskell heap, which may contain `MVar`s and - threads that are blocked on them. - -Components `R` and `H` are used linearly, so they can be stored in -global mutable state. - -The machine will enjoy a set of labeled transitions such as are -described in Simon PJ's paper on the "Awkward Squad." Call these the -"standard transitions." (The awkward-squad machine state is a single -term, formed by the parallel composition of `R` with all the threads -of `Q` and all the MVars of `H`. The awkward squad doesn't care about -order, but we do.) To specify the standard transitions, we could add -an additional clock that tells the machine when to switch the running -thread `R` out for a new thread from the queue. Or we could leave the -context switch nondeterministic, as it is in the awkward-squad paper. -Whatever seems useful. - -Every state transition has the potential use to fuel. Fuel might -actually be implemented using an allocation clock, but for semantics -purposes, we can simply decrement fuel at each state transition, then -gate the standard transitions on the condition `F > 0`. - -At a high level, every invocation of Haskell looks the same: -JavaScript starts the Haskell machine in a state `⟨F, •, Q, H⟩`, and -the Haskell machine makes repeated state transitions until it reaches -one of two stopping states: - - - `⟨F',•, [], H'⟩`: no Haskell threads are left to run - - - `⟨0, ̧R', Q', H'⟩`: fuel is exhausted, in which case the machine - moves the currently running thread onto the run queue, reaching - state `⟨0, ̧•, R':Q', H'⟩` - -Once one of these states is reached, GHC's runtime system takes two -actions: - - 1. It allocates a polite request for the CPU and puts that request on - the JavaScript message queue, probably using `setTimeout` with a - delay of 0 seconds. - - 2. It returns control to JavaScript. +See rendered [pdf](semantics.pdf) for details. diff --git a/docs/src/semantics.tex b/docs/src/semantics.tex new file mode 100644 index 00000000..d8fcdcfc --- /dev/null +++ b/docs/src/semantics.tex @@ -0,0 +1,347 @@ +\documentclass{article} +\usepackage[top=0.2in,bottom=0.3in]{geometry} +\usepackage{amsmath} + +\makeatletter +\newcommand{\mono}[1]{% Thanks Paul A. (Windfall Software) + {\@tempdima = \fontdimen2\font + \frenchspacing + \texttt{\spaceskip = 1.1\@tempdima{}#1}}} +\newcommand\xmono[2][1.1] + {\@tempdima = \fontdimen2\font + \frenchspacing + \texttt{\spaceskip = #1\@tempdima{}#2}} +\makeatother +\newcommand*\monobox[2][1.1]{\mbox{\upshape\xmono[#1]{#2}}} + + +% ★ +\newcommand\slicemessage{\mbox{☆}} +\newcommand\slicecont{\ensuremath{K_{☆}}} +\newcommand\notrunning{\mbox{•}} + +\newcommand\emptylist{\ensuremath{[\,]}} + +\usepackage{fontspec} + + \setmainfont{STIX Two}[ + UprightFont={* Math}, + ItalicFont={* Text Italic}, + BoldFont={* Text Bold}, + BoldItalicFont={* Text Bold Italic}, + ] + +\usepackage{keyval} + + +\makeatletter + +\newcommand\defcomponent[2]{% + \define@key{state}{#1}{\@nameuse{setmyval@#1}{##1}}% + \@namedef{setmyval@#1}##1{\@namedef{myval@#1}{##1}}% + \@nameuse{setmyval@#1}{\ensuremath{#2}}% +} + +\newcommand\component[1]{\@nameuse{myval@#1}} + +\makeatother + +\defcomponent{fuel}{F} +\defcomponent{running}{R} +\newcommand\norunning{\ensuremath{\mathord{\bullet}}} +\defcomponent{queue}{Q} +\defcomponent{heap}{H} +\defcomponent{jsheap}{J} +\defcomponent{mc}{M} +\defcomponent{slicemessage}{\slicemessage} +%\defcomponent{cont}{K} +\defcomponent{slicecont}{\slicecont} +\newcommand\running[1]{\ensuremath{\ast#1}} + +\newcommand\anystate[1][]{% + {\tracingall\setkeys{state}{#1}% + \let\c=\component + \ensuremath{\langle \c{fuel}, \c{running}, \c{queue}, \c{heap}; + \c{mc}, \c{jsheap} \rangle}}} +\newcommand\hstate[1][]{\anystate[running=\running{R},mc=K,#1]} +\newcommand\jstate[1][]{\anystate[fuel=0,running=\notrunning,mc=\running{M},#1]} + +\newcommand\goesto[1]{\xrightarrow{#1}} % {\stackrel{#1}{\longrightarrow}} + +\newcommand\cons[2]{\ensuremath{#1\mathbin:#2}} +\newcommand\parthreads[2]{\ensuremath{#1\mathbin\mid#2}} + +\begin{document} + +\title{Interaction of GHC Runtime with JavaScript Runtime} +\author{Norman Ramsey\\Cheng Shao} + +\maketitle + + + + +\section{Semantics of concurrency}\label{concurrency-sketch} + +To~describe how GHC's run-time system must interact with JavaScript, +we~present a small-step operational semantics using labeled +transitions. +The~semantics is in the same style as the +semantics in Simon PJ's tutorial \emph{Tackling the Awkward Squad}, +only not as thorough and careful. + +\subsection{Metavariables and state} + +The semantics describes transitions of a state machine whose +components model the states of \emph{both} Haskell and JavaScript runtimes. +\begin{itemize} +\item + $F$ (``fuel'') is the number of ticks a Haskell thread can + execute before returning control to JavaScript. This component is + always nonnegative, and it is + nonzero only when Haskell code is running. +\item + $R$ (``running'') is either the currently running Haskell + thread, or if no Haskell code is currently running, it is \notrunning\ (``nothing'') +\item + $Q$ (``run queue'') is a collection of runnable Haskell threads. + For the moment we model it using the parallel-composition + operator~$\mid$ from the Awkward Squad tutorial. + So a queue with $n$ runnable threads might have the form $R_1 \mid + R_2 \mid \cdots \mid R_n$. + An~empty queue is written like an empty list, thus:~\emptylist. +\item + $H$ (``heap'') is the Haskell heap, which may contain + \texttt{MVar}s and threads that are blocked on them. +\item +$M$ (``message'') is a JavaScript message. +Note that in JavaScript, \emph{message} is a term of~art. +The message is the atomic unit of concurrency. +JavaScript works by taking a message from its ``message queue,'' +running that message to completion (atomically), +then repeating. +While JavaScript is executing a message~$M$, that message will show up +as a component of the state machine. +\item +$K$ (``continuation'') is a JavaScript continuation. +This component represents the state of a JavaScript message that is +suspended waiting for Haskell code to return a value (most commonly a +\texttt{Promise}). +The~continuation is resumed when a value is supplied, so +$K\,v$ is a message. +\item +$J$ is the JavaScript heap, including the message queue, which we +don't model separately. +To~add a message~$M$ to the message queue, we write +$J+M$, which stands for the JavaScript heap that is like~$J$, except +$M$~is added to the message queue. +\item +\slicemessage\ is the special message that, when executed, causes JavaScript +to allocate a time slice to Haskell. +When GHC's runtime system wants to be sure it has a future opportunity +to run Haskell code, it~will add this message \slicemessage\ to +JavaScript's message queue. +\item +\slicecont\ is a continuation that JavaScript is suspended on when +it has granted a time slice to Haskell. +This~continuation is unique in that it doesn't expect a meaningful +return value. +It~is resumed by passing it the empty tuple, so $\slicecont()$ is a message. +\item +$v$ is a value (JavaScript value or Haskell WHNF). +\item +$e$ is an unevaluated Haskell expression, thunk, or closure (all words for the same +thing). +JavaScript calls into Haskell by constructing a closure, then asking +GHC's run-time system to evaluate~it. +\end{itemize} + +\noindent +A~complete machine state includes both Haskell components and +JavaScript components. +The two sets of components are separated by a semicolon. +Common forms of machine state include the following: +\begin{itemize} +\item +In state \hstate[mc=\slicecont], Haskell code is running in thread $R$ with fuel~$F$. +JavaScript is suspended on continuation~$\slicecont$ waiting for +Haskell to give back the~CPU. +\item +In state \hstate[mc=K], Haskell code is running in thread $R$ with fuel~$F$. +JavaScript called Haskell to get a \texttt{Promise}, and it is +suspended on continuation~$K$ waiting for that \texttt{Promise} to be delivered. +\item +In state \jstate, JavaScript has the~CPU and is running code +associated with message~$M$. +Haskell is suspended and not running. +Typically $F=0$, but the value of~$F$ is not relevant. +\end{itemize} +Components $R$, $Q$, $H$, and~$J$ are all meant to be stored in global +variables in their respective run-time systems. + +It~would be lovely to have a good invariant describing when the +message \slicemessage\ can appear in the JavaScript message queue, and +how many times. +I~think we like ``exactly when Haskell code is runnable'' and + ``at most once.'' + +\subsection{Most interesting state transitions} + + +The machine will enjoy a set of labeled transitions such as are +described in Simon PJ's paper on the ``Awkward Squad.'' Call these the +``standard transitions.'' (The awkward-squad machine state is a single +term, formed by the parallel composition of $R$ with all the +threads of~$Q$ and all the MVars of~$H$. The awkward squad +doesn't care about order, but we~do.) + +An~important nonstandard transition is made when the Haskell engine +runs out of fuel and returns control to JavaScript. +The~currently running thread is placed on the run queue. +\[ +\hstate[fuel=0,mc=\slicecont] +\goesto{\text{\texttt{return} from \texttt{rts\_grant\_slice}}} +\jstate[queue=\parthreads R Q,mc=\slicecont(),jsheap=J+\slicemessage] +\] +JavaScript starts running $\slicecont()$, that is, it resumes the +message that was suspended waiting for Haskell. +Notice that before returning, Haskell puts message \slicemessage\ on +JavaScript's message queue---thereby ensuring that at some future +point, it~will get more~CPU. +That's accomplished by a \texttt{setImmediate} call into the +JavaScript runtime, whose function will call \texttt{rts\_grant\_slice} +in the Haskell runtime. + +Haskell will also return control to JavaScript if there are no Haskell +threads available to run. +In~this case, it~doesn't ask for future access to the~CPU. +\[ +\hstate[running=\notrunning,queue={\emptylist},mc=\slicecont] +\goesto{\text{\texttt{return} from \texttt{rts\_grant\_slice}}} +\jstate[queue={\emptylist},mc=\slicecont()] +\] + +When JavaScript sees the special message~\slicemessage, it~grants CPU +to Haskell by calling \texttt{rts\_grant\_slice($F_0$)}, where +$F_0$~is the amount of fuel to grant. +(In~practice, this policy will likely be set elsewhere, not passed +from JavaScript.) +\[ +\jstate[queue=\parthreads R Q,mc=\slicemessage] +\goesto{\mathtt{rts\_grant\_slice}(F_0)} +\hstate[running=R,fuel=F_0,mc=\slicecont] +\] + +When JavaScript wants to call Haskell asynchronously, it first constructs a +closure~$e$, then calls \texttt{eval\_async($e$)}, which forks a new +Haskell thread. +\[ +\begin{array}{l@{}l} +\multicolumn2{l}{% + \jstate[mc=K(e)] + \goesto{\mathtt{eval\_async}(e)\text{ and \texttt{return} $p$}} + \jstate[mc=K(p),queue=\parthreads R Q,jsheap=J+\slicemessage] +} +\\[5pt] +\qquad \text{where}~ +& +p = \text{a fresh promise}\\ +& +R = \text{\texttt{deliver} $p$ (\texttt{try} $e$)}\\ +& +\text{\texttt{deliver p (Left exn)}} = \text{\texttt{p.fails exn}}\\ +& +\text{\texttt{deliver p (Right v)~}} = \text{\texttt{p.succeeds v}} +\\ +% +%\begin{minipage}{0.5\linewidth} +\end{array} +\] +This transition specifies the following sequence of actions: +\begin{enumerate} +\item +Allocate a fresh promise~$p$. +\item +Fork a new thread~$R$, which will evaluate~$e$. +(When the evaluation of~$e$ completes, $R$~will deliver the result to promise~$p$.) +\item +Ask JavaScript to grant future CPU (so that $R$ can be run). +\item +Return $p$ to JavaScript \emph{immediately}. +\end{enumerate} + + +When Haskell wants to call JavaScript asynchronously, it~expects a +\texttt{Promise} in return, and it suspends the current thread waiting +on the \texttt{Promise}. +The run-time system already handles this case as part of its support +for \texttt{MVar}s and exceptions; no special state transition is required. +To~demonstrate, +let's assume we define an exception constructor something like this: +\begin{verbatim} +data JSFault = JSFault JSValue +instance Exception JSFault where ... +\end{verbatim} +The Haskell equivalent of a call \monobox{f e} to a foreign JavaScript +function with argument~\texttt{e} can look something like this: +\begin{verbatim} + do let v = haskellToJS e -- evaluates e, converts result to JavaScript + p <- g v -- call to JavaScript returns a `Promise` + m <- newEmptyMVar + myself <- threadId + p.then(putMVar m, throwTo myself . JSFault) -- attach continuations to p + takeMVar m -- block waiting for async call to complete +\end{verbatim} +To~make this code work, the two continuations that contain references +to~\texttt{m} and \texttt{myself} will have to be wrapped in +JavaScript lambdas, something like this: +\begin{verbatim} + (v => rts_eval_async(⟦putVar m⟧)) +\end{verbatim} +where \texttt{⟦putVar m⟧} denotes the run-time representation of the +Haskell closure in JavaScript. + +Notes: Calls \monobox{g v} and \monobox{p.then(...)} are both +\emph{synchronous}. + + + +\subsection{Other state transitions} + +Computation is modeled by evolving the running thread from +$R$~to~$R'$. +Computation may also change the Haskell heap and the queue of runnable +threads, as described in the Awkward Squad paper. +Computation uses fuel, which is always +nonnegative. +\[ +\hstate[fuel=F+1] +\goesto{\text{computation}} +\hstate[running=\running{R'},queue=Q',heap=H'] +\] + +The Haskell scheduler can decide to grant the CPU to a different +Haskell thread. +\[ +\hstate[fuel=F,queue=\parthreads{R'} Q] +\goesto{\text{context switch}} +\hstate[running=\running{R'},queue=\parthreads{Q}{R}] +\] + +\subsection{Scaling up to POSIX} + +What if a user types Control-C? + +Unix: +\begin{itemize} +\item +Computation doesn't use fuel. +\item +Add component to state machine: \emph{why} does $F=0$? +Maybe because: a~user hit Control-C. +\end{itemize} + + + +\end{document} +