WIP

2024-08-16 04:00:24 +03:00 · 2022-01-20 13:20:31 +00:00 · 2022-01-20 13:20:31 +00:00 · 0a21d34ab7
commit 0a21d34ab7
parent a331d16633
3 changed files with 355 additions and 114 deletions
--- a/docs/default.nix
+++ b/docs/default.nix
@ -1,4 +1,4 @@
-{ haskell-nix, mdbook, stdenvNoCC }:
+{ haskell-nix, mdbook, stdenvNoCC, stix-two, texlive }:
 stdenvNoCC.mkDerivation {
  name = "asterius-docs";
  src = haskell-nix.haskellLib.cleanGit {
@ -6,8 +6,12 @@ stdenvNoCC.mkDerivation {
    src = ../.;
    subDir = "docs";
  };
-  nativeBuildInputs = [ mdbook ];
+  nativeBuildInputs = [ mdbook stix-two texlive.combined.scheme-full ];
  buildPhase = ''
+    pushd $(mktemp -d)
+    latexmk -pdfxe -file-line-error -halt-on-error $OLDPWD/src/semantics.tex
+    mv semantics.pdf $OLDPWD/src
+    popd
    mdbook build --dest-dir $out
  '';
  dontInstall = true;
--- a/docs/src/semantics.md
+++ b/docs/src/semantics.md
@ -1,113 +1,3 @@
-# Draft semantics of concurrency and foreign calls.
+# Interaction of GHC Runtime with JavaScript Runtime

-Note: This document assumes that every function takes exactly one
-argument. Just imagine that it's the last argument in a fully
-saturated call.
-
-## Foreign export asynchronous
-
-Suppose that a Haskell function `f` is exported to JavaScript
-asynchronously (which might be the default). When JavaScript calls the
-exported function with argument `v`, it has the effect of performing
-the IO action `⟦f⟧ v`, where the translation `⟦f⟧` is defined as
-follows:
-
-    ⟦f⟧ v = do
-       p <- allocate new promise
-       let run_f = case try (return $ f $ jsToHaskell v) of
-                     Left exn -> p.fails (exnToJS exn)
-                     Right a  -> p.succeeds (haskellToJS a)
-       forkIO run_f
-       return p -- returned to JavaScript
-
-Not specified here is whether the scheduler is allowed to steal a few
-cycles to run previously forked threads.
-
-N.B. This is just a semantics. We certainly have the option of
-implementing the entire action completely in the runtime system.
-
-Not yet specified: What is the API by which JavaScript would call an
-asynchronously exported Haskell function? Would it, for example, use
-API functions to construct a Haskell closure, then evaluate it?
-
-## Foreign import asynchronous
-
-Suppose that a JavaScript  function `g` is imported asynchronously
-(which might be the default). Let types `a` and `b` stand for two
-unknown but fixed types. The JavaScript function expects an argument
-of type `a` and returns a `Promise` that (if successful) eventually
-delivers a value of type `b`. When a Haskell thunk of the form `g e`
-is forced (evaluated), the machine performs the following monadic
-action, the result of which is (eventually) written into the thunk.
-
-    do let v = haskellToJS e  -- evaluates e, converts result to JavaScript
-      p <- g v               -- call returns a `Promise`, "immediately"
-      m <- newEmptyMVar
-      ... juju to associate m with p ... -- RTS primitive?
-      result <- takeMVar m
-      case result of Left fails -> ... raise asynchronous exception ...
-                      Right b -> return $ jsToHaskell v
-
-## CPU sharing
-
-Suppose GHC wishes to say politely to the JavaScript engine, "every so
-often I would like to use the CPU for a bounded time."  It looks like
-Haskell would need to add a message to the JavaScript message queue,
-such that the function associated with that messages is "run Haskell
-for N ticks."  Is the right API to call `setTimeout` with a delay of 0
-seconds?
-
-## Concurrency sketch
-
-Let's suppose the state of a Haskell machine has these components:
-
-  - `F` ("fuel") is the number of ticks a Haskell thread can execute
-    before returning control to JavaScript. This component is present
-    only when Haskell code is running.
-
-  - `R` ("running") is either the currently running Haskell thread, or
-    if no thread is currently running, it is • ("nothing")
-
-  - `Q` ("run queue") is a collection of runnable threads.
-
-  - `H` ("heap") is the Haskell heap, which may contain `MVar`s and
-    threads that are blocked on them.
-
-Components `R` and `H` are used linearly, so they can be stored in
-global mutable state.
-
-The machine will enjoy a set of labeled transitions such as are
-described in Simon PJ's paper on the "Awkward Squad."  Call these the
-"standard transitions."  (The awkward-squad machine state is a single
-term, formed by the parallel composition of `R` with all the threads
-of `Q` and all the MVars of `H`. The awkward squad doesn't care about
-order, but we do.)  To specify the standard transitions, we could add
-an additional clock that tells the machine when to switch the running
-thread `R` out for a new thread from the queue. Or we could leave the
-context switch nondeterministic, as it is in the awkward-squad paper.
-Whatever seems useful.
-
-Every state transition has the potential use to fuel. Fuel might
-actually be implemented using an allocation clock, but for semantics
-purposes, we can simply decrement fuel at each state transition, then
-gate the standard transitions on the condition `F > 0`.
-
-At a high level, every invocation of Haskell looks the same:
-JavaScript starts the Haskell machine in a state `⟨F, •, Q, H⟩`, and
-the Haskell machine makes repeated state transitions until it reaches
-one of two stopping states:
-
-  - `⟨F',•, [], H'⟩`: no Haskell threads are left to run
-
-  - `⟨0, ̧R', Q', H'⟩`: fuel is exhausted, in which case the machine
-    moves the currently running thread onto the run queue, reaching
-    state `⟨0, ̧•, R':Q', H'⟩`
-
-Once one of these states is reached, GHC's runtime system takes two
-actions:
-
- 1. It allocates a polite request for the CPU and puts that request on
-    the JavaScript message queue, probably using `setTimeout` with a
-    delay of 0 seconds.
-
- 2. It returns control to JavaScript.
+See rendered [pdf](semantics.pdf) for details.
--- a/docs/src/semantics.tex
+++ b/docs/src/semantics.tex
@ -0,0 +1,347 @@
+\documentclass{article}
+\usepackage[top=0.2in,bottom=0.3in]{geometry}
+\usepackage{amsmath}
+
+\makeatletter
+\newcommand{\mono}[1]{%  Thanks Paul A. (Windfall Software)
+  {\@tempdima = \fontdimen2\font
+   \frenchspacing
+   \texttt{\spaceskip = 1.1\@tempdima{}#1}}}
+\newcommand\xmono[2][1.1]
+  {\@tempdima = \fontdimen2\font
+   \frenchspacing
+   \texttt{\spaceskip = #1\@tempdima{}#2}}
+\makeatother
+\newcommand*\monobox[2][1.1]{\mbox{\upshape\xmono[#1]{#2}}}
+
+
+% ★
+\newcommand\slicemessage{\mbox{☆}}
+\newcommand\slicecont{\ensuremath{K_{☆}}}
+\newcommand\notrunning{\mbox{•}}
+
+\newcommand\emptylist{\ensuremath{[\,]}}
+
+\usepackage{fontspec}
+
+  \setmainfont{STIX Two}[
+                  UprightFont={* Math},
+                  ItalicFont={* Text Italic},
+                  BoldFont={* Text Bold},
+                  BoldItalicFont={* Text Bold Italic},
+                  ]
+
+\usepackage{keyval}
+
+
+\makeatletter
+
+\newcommand\defcomponent[2]{%
+  \define@key{state}{#1}{\@nameuse{setmyval@#1}{##1}}%
+  \@namedef{setmyval@#1}##1{\@namedef{myval@#1}{##1}}%
+  \@nameuse{setmyval@#1}{\ensuremath{#2}}%
+}
+
+\newcommand\component[1]{\@nameuse{myval@#1}}
+
+\makeatother
+
+\defcomponent{fuel}{F}
+\defcomponent{running}{R}
+\newcommand\norunning{\ensuremath{\mathord{\bullet}}}
+\defcomponent{queue}{Q}
+\defcomponent{heap}{H}
+\defcomponent{jsheap}{J}
+\defcomponent{mc}{M}
+\defcomponent{slicemessage}{\slicemessage}
+%\defcomponent{cont}{K}
+\defcomponent{slicecont}{\slicecont}
+\newcommand\running[1]{\ensuremath{\ast#1}}
+
+\newcommand\anystate[1][]{%
+  {\tracingall\setkeys{state}{#1}%
+   \let\c=\component
+   \ensuremath{\langle \c{fuel}, \c{running}, \c{queue}, \c{heap}; 
+               \c{mc}, \c{jsheap} \rangle}}}
+\newcommand\hstate[1][]{\anystate[running=\running{R},mc=K,#1]}
+\newcommand\jstate[1][]{\anystate[fuel=0,running=\notrunning,mc=\running{M},#1]}
+
+\newcommand\goesto[1]{\xrightarrow{#1}} % {\stackrel{#1}{\longrightarrow}}
+
+\newcommand\cons[2]{\ensuremath{#1\mathbin:#2}}
+\newcommand\parthreads[2]{\ensuremath{#1\mathbin\mid#2}}
+
+\begin{document}
+
+\title{Interaction of GHC Runtime with JavaScript Runtime}
+\author{Norman Ramsey\\Cheng Shao}
+
+\maketitle
+
+
+
+
+\section{Semantics of concurrency}\label{concurrency-sketch}
+
+To~describe how GHC's run-time system must interact with JavaScript,
+we~present a small-step operational semantics using labeled
+transitions.
+The~semantics is in the same style as the
+semantics in Simon PJ's tutorial \emph{Tackling the Awkward Squad},
+only not as thorough and careful.
+
+\subsection{Metavariables and state}
+
+The semantics describes transitions of a state machine whose
+components model the states of \emph{both} Haskell and JavaScript runtimes.
+\begin{itemize}
+\item
+  $F$ (``fuel'') is the number of ticks a Haskell thread can
+  execute before returning control to JavaScript. This component is
+  always nonnegative, and it is
+  nonzero only when Haskell code is running.
+\item
+  $R$ (``running'') is either the currently running Haskell
+  thread, or if no Haskell code is currently running, it is \notrunning\ (``nothing'')
+\item
+  $Q$ (``run queue'') is a collection of runnable Haskell threads.
+  For the moment we model it using the parallel-composition
+  operator~$\mid$ from the Awkward Squad tutorial.
+  So a queue with $n$ runnable threads might have the form $R_1 \mid
+  R_2 \mid \cdots \mid R_n$.
+  An~empty queue is written like an empty list, thus:~\emptylist.
+\item
+  $H$ (``heap'') is the Haskell heap, which may contain
+  \texttt{MVar}s and threads that are blocked on them.
+\item
+$M$ (``message'') is a JavaScript message.
+Note that in JavaScript,  \emph{message} is a term of~art.
+The message is the atomic unit of concurrency.
+JavaScript works by taking a message from its ``message queue,''
+running that message to completion (atomically),
+then repeating.
+While JavaScript is executing a message~$M$, that message will show up
+as a component of the state machine.
+\item
+$K$ (``continuation'') is a JavaScript continuation.
+This component represents the state of a JavaScript message that is
+suspended waiting for Haskell code to return a value (most commonly a
+\texttt{Promise}). 
+The~continuation is resumed when a value is supplied, so
+$K\,v$ is a message.
+\item
+$J$ is the JavaScript heap, including the message queue, which we
+don't model separately.
+To~add a message~$M$ to the message queue, we write
+$J+M$, which stands for the JavaScript heap that is like~$J$, except
+$M$~is added to the message queue.
+\item
+\slicemessage\ is the special message that, when executed, causes JavaScript
+to allocate a time slice to Haskell.
+When GHC's runtime system wants to be sure it has a future opportunity
+to run Haskell code, it~will add this message \slicemessage\ to
+JavaScript's message queue.
+\item
+\slicecont\ is a continuation that JavaScript is suspended on when
+it has granted a time slice to Haskell.
+This~continuation is unique in that it doesn't expect a meaningful
+return value. 
+It~is resumed by passing it the empty tuple, so $\slicecont()$ is a message.
+\item
+$v$ is a value (JavaScript value or Haskell WHNF).
+\item
+$e$ is an unevaluated Haskell expression, thunk, or closure (all words for the same
+thing).
+JavaScript calls into Haskell by constructing a closure, then asking
+GHC's run-time system to evaluate~it.
+\end{itemize}
+
+\noindent
+A~complete machine state includes both Haskell components and
+JavaScript components.
+The two sets of components are separated by a semicolon.
+Common forms of machine state include the following:
+\begin{itemize}
+\item
+In state \hstate[mc=\slicecont], Haskell code is running in thread $R$ with fuel~$F$.
+JavaScript is suspended on continuation~$\slicecont$ waiting for
+Haskell to give back the~CPU.
+\item
+In state \hstate[mc=K], Haskell code is running in thread $R$ with fuel~$F$.
+JavaScript called Haskell to get a \texttt{Promise}, and it is
+suspended on continuation~$K$ waiting for that \texttt{Promise} to be delivered.
+\item
+In state \jstate, JavaScript has the~CPU and is running code
+associated with message~$M$.
+Haskell is suspended and not running.
+Typically $F=0$, but the value of~$F$ is not relevant.
+\end{itemize}
+Components $R$, $Q$, $H$, and~$J$ are all meant to be stored in global
+variables in their respective run-time systems.
+
+It~would be lovely to have a good invariant describing when the
+message \slicemessage\ can appear in the JavaScript message queue, and
+how many times.
+I~think we like ``exactly when Haskell code is runnable'' and
+ ``at most once.''
+
+\subsection{Most interesting state transitions}
+
+
+The machine will enjoy a set of labeled transitions such as are
+described in Simon PJ's paper on the ``Awkward Squad.'' Call these the
+``standard transitions.'' (The awkward-squad machine state is a single
+term, formed by the parallel composition of $R$ with all the
+threads of~$Q$ and all the MVars of~$H$. The awkward squad
+doesn't care about order, but we~do.)
+
+An~important nonstandard transition is made when the Haskell engine
+runs out of fuel and returns control to JavaScript. 
+The~currently running thread is placed on the run queue.
+\[
+\hstate[fuel=0,mc=\slicecont] 
+\goesto{\text{\texttt{return} from \texttt{rts\_grant\_slice}}}
+\jstate[queue=\parthreads R Q,mc=\slicecont(),jsheap=J+\slicemessage]
+\]
+JavaScript starts running $\slicecont()$, that is, it resumes the
+message that was suspended waiting for Haskell.
+Notice that before returning, Haskell puts message \slicemessage\ on
+JavaScript's message queue---thereby ensuring that at some future
+point, it~will get more~CPU.
+That's accomplished by a \texttt{setImmediate} call into the
+JavaScript runtime, whose function will call \texttt{rts\_grant\_slice}
+in the Haskell runtime.
+
+Haskell will also return control to JavaScript if there are no Haskell
+threads available to run.
+In~this case, it~doesn't ask for future access to the~CPU.
+\[
+\hstate[running=\notrunning,queue={\emptylist},mc=\slicecont] 
+\goesto{\text{\texttt{return} from \texttt{rts\_grant\_slice}}}
+\jstate[queue={\emptylist},mc=\slicecont()]
+\]
+
+When JavaScript sees the special message~\slicemessage, it~grants CPU
+to Haskell by calling \texttt{rts\_grant\_slice($F_0$)}, where
+$F_0$~is the amount of fuel to grant.
+(In~practice, this policy will likely be set elsewhere, not passed
+from JavaScript.)
+\[
+\jstate[queue=\parthreads R Q,mc=\slicemessage]
+\goesto{\mathtt{rts\_grant\_slice}(F_0)}
+\hstate[running=R,fuel=F_0,mc=\slicecont] 
+\]
+
+When JavaScript wants to call Haskell asynchronously, it first constructs a
+closure~$e$, then calls \texttt{eval\_async($e$)}, which forks a new
+Haskell thread.
+\[
+\begin{array}{l@{}l}
+\multicolumn2{l}{%
+  \jstate[mc=K(e)]
+  \goesto{\mathtt{eval\_async}(e)\text{ and \texttt{return} $p$}}
+  \jstate[mc=K(p),queue=\parthreads R Q,jsheap=J+\slicemessage]
+}
+\\[5pt]
+\qquad \text{where}~
+&
+p = \text{a fresh promise}\\
+&
+R = \text{\texttt{deliver} $p$ (\texttt{try} $e$)}\\
+&
+\text{\texttt{deliver p (Left exn)}} = \text{\texttt{p.fails exn}}\\
+&
+\text{\texttt{deliver p (Right v)~}} = \text{\texttt{p.succeeds v}}
+\\
+%
+%\begin{minipage}{0.5\linewidth}
+\end{array}
+\]
+This transition specifies the following sequence of actions:
+\begin{enumerate}
+\item
+Allocate a fresh promise~$p$.
+\item
+Fork a new thread~$R$, which will evaluate~$e$.
+(When the evaluation of~$e$ completes, $R$~will deliver the result to promise~$p$.)
+\item
+Ask JavaScript to grant future CPU (so that $R$ can be run).
+\item
+Return $p$ to JavaScript \emph{immediately}.
+\end{enumerate}
+
+
+When Haskell wants to call JavaScript asynchronously, it~expects a
+\texttt{Promise} in return, and it suspends the current thread waiting
+on the \texttt{Promise}.  
+The run-time system already handles this case as part of its support
+for \texttt{MVar}s and exceptions; no special state transition is required.
+To~demonstrate,
+let's assume we define an exception constructor something like this:
+\begin{verbatim}
+data JSFault = JSFault JSValue
+instance Exception JSFault where ...
+\end{verbatim}
+The Haskell equivalent of a call \monobox{f e} to a foreign JavaScript
+function with argument~\texttt{e} can look something like this:
+\begin{verbatim}
+  do let v = haskellToJS e  -- evaluates e, converts result to JavaScript
+     p <- g v               -- call to JavaScript returns a `Promise`
+     m <- newEmptyMVar
+     myself <- threadId
+     p.then(putMVar m, throwTo myself . JSFault) -- attach continuations to p
+     takeMVar m             -- block waiting for async call to complete
+\end{verbatim}
+To~make this code work, the two continuations that contain references
+to~\texttt{m} and \texttt{myself} will have to be wrapped in
+JavaScript lambdas, something like this:
+\begin{verbatim}
+  (v => rts_eval_async(⟦putVar m⟧))
+\end{verbatim}
+where \texttt{⟦putVar m⟧} denotes the run-time representation of the
+Haskell closure in JavaScript. 
+
+Notes: Calls \monobox{g v} and \monobox{p.then(...)} are both
+\emph{synchronous}.
+
+
+
+\subsection{Other state transitions}
+
+Computation is modeled by evolving the running thread from
+$R$~to~$R'$.
+Computation may also change the Haskell heap and the queue of runnable
+threads, as described in the Awkward Squad paper.
+Computation uses fuel, which is always
+nonnegative.
+\[
+\hstate[fuel=F+1]
+\goesto{\text{computation}}
+\hstate[running=\running{R'},queue=Q',heap=H']
+\]
+
+The Haskell scheduler can decide to grant the CPU to a different
+Haskell thread.
+\[
+\hstate[fuel=F,queue=\parthreads{R'} Q]
+\goesto{\text{context switch}}
+\hstate[running=\running{R'},queue=\parthreads{Q}{R}]
+\]
+
+\subsection{Scaling up to POSIX}
+
+What if a user types Control-C?
+
+Unix:
+\begin{itemize}
+\item
+Computation doesn't use fuel.
+\item
+Add component to state machine: \emph{why} does $F=0$?
+Maybe because: a~user hit Control-C.
+\end{itemize}
+
+
+
+\end{document}
+