From bd193689675cc5630d5cf514afa281d0335c3e12 Mon Sep 17 00:00:00 2001
From: Shao Cheng
Date: Thu, 5 Apr 2018 19:47:03 +0800
Subject: [PATCH] Write a wasm commentary in docs

---
 docs/architecture.md |  2 --
 docs/building.md     |  2 --
 docs/index.md        |  2 --
 docs/roadmap.md      |  4 +---
 docs/webassembly.md  | 36 ++++++++++++++++++++++++++++++++++++
 mkdocs.yml           | 11 +++++++----
 6 files changed, 44 insertions(+), 13 deletions(-)
 create mode 100644 docs/webassembly.md

diff --git a/docs/architecture.md b/docs/architecture.md
index 99b829c3..56b06a8e 100644
--- a/docs/architecture.md
+++ b/docs/architecture.md
@@ -1,5 +1,3 @@
-# Project architecture
-
 ## High-level architecture

 The `asterius` project is hosted at [GitHub](https://github.com/tweag/asterius). The monorepo contains several packages:
diff --git a/docs/building.md b/docs/building.md
index fa72021a..963f681a 100644
--- a/docs/building.md
+++ b/docs/building.md
@@ -1,5 +1,3 @@
-# Building guide
-
 `asterius` is tested on Linux x64 and Windows x64. macOS x64 may also work.

 tl;dr: See [`.circleci/config.yml`](https://github.com/tweag/asterius/blob/master/.circleci/config.yml) for CircleCI config, [`appveyor.yml`](https://github.com/tweag/asterius/blob/master/appveyor.yml) for AppVeyor config.
diff --git a/docs/index.md b/docs/index.md
index abf14584..469376c4 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -1,5 +1,3 @@
-# asterius: A Haskell to WebAssembly compiler
-
 [![CircleCI](https://circleci.com/gh/tweag/asterius/tree/master.svg?style=shield)](https://circleci.com/gh/tweag/asterius/tree/master)
 [![AppVeyor](https://ci.appveyor.com/api/projects/status/github/tweag/asterius?branch=master&svg=true)](https://ci.appveyor.com/project/TerrorJack/asterius?branch=master)

diff --git a/docs/roadmap.md b/docs/roadmap.md
index f90419db..1169f978 100644
--- a/docs/roadmap.md
+++ b/docs/roadmap.md
@@ -1,6 +1,4 @@
-# Project roadmap
-
-This document maintains a list of milestones along with their planned features. Some notations:
+This page maintains a list of milestones along with their planned features. Some notations:

 * M0, M1, .. indicates Milestone 0, Milestone 1, etc. The numbers grow monotonically.
 * P0, P1, .. incicates Priority 0, Priority 1, etc. The lesser the number, the more significant is the feature.
diff --git a/docs/webassembly.md b/docs/webassembly.md
new file mode 100644
index 00000000..f47b1978
--- /dev/null
+++ b/docs/webassembly.md
@@ -0,0 +1,36 @@
+## WebAssembly as a Haskell compilation target
+
+There are a few issues to address when compiling Cmm to WebAssembly.
+
+### Implementing Haskell Stack/Heap
+
+The Haskell runtime maintains a TSO (Thread State Object) for each Haskell thread, and each TSO contains a separate stack for the STG machine. WebAssembly has its own notion of a "stack", though: its execution model is a stack machine, where instructions consume operands from the stack and push new values onto it.
+
+We use linear memory to simulate the Haskell stack and heap. Pushing to or popping from the Haskell stack only involves stores to and loads from linear memory, and heap allocation only involves bumping the heap pointer. Running out of space triggers a WebAssembly trap instead of a GC.
+
+Throughout the documentation, "stack" refers to the Haskell stack unless explicitly stated otherwise.
+
+### Implementing STG machine registers
+
+The Haskell runtime makes use of "virtual registers" like Sp, Hp or R1 to implement the STG machine. The NCG (native code generator) tries to map some of the virtual registers to real registers when generating assembly code. However, WebAssembly doesn't have language constructs that map to real registers, so all virtual registers are implemented as local variables of the interpreter function.
+
+### Handling control flow
+
+WebAssembly currently enforces structured control flow, which prohibits arbitrary branching. It also lacks explicit tail calls.
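To make the tail-call constraint concrete, here is a minimal sketch in Python. It is purely illustrative and not asterius code. It shows how a long chain of jumps can run without growing the native stack on a host that lacks tail calls: each block returns the label of its successor to a driver loop instead of calling it. The block names, the `regs` dictionary (where `R1` merely echoes the STG register name) and the `run` function are all hypothetical.

```python
from typing import Callable, Dict, Optional

# Hypothetical register file; "R1" merely echoes the STG register name.
Regs = Dict[str, int]
# A block mutates the registers and returns the label to jump to next (None = halt).
Block = Callable[[Regs], Optional[str]]


def count_down(regs: Regs) -> Optional[str]:
    """Loop header: keep going while R1 > 0."""
    return "step" if regs["R1"] > 0 else None


def step(regs: Regs) -> Optional[str]:
    """Loop body: decrement R1 and jump back to the header."""
    regs["R1"] -= 1
    regs["steps"] += 1
    return "count_down"


BLOCKS: Dict[str, Block] = {"count_down": count_down, "step": step}


def run(entry: str, regs: Regs) -> Regs:
    """Driver loop: jumps become return values, so the native stack stays flat."""
    label: Optional[str] = entry
    while label is not None:
        label = BLOCKS[label](regs)
    return regs


# 100,000 jumps would exceed Python's default recursion limit if each block
# called its successor directly; the driver loop handles them fine.
final = run("count_down", {"R1": 100_000, "steps": 0})
print(final["R1"], final["steps"])
```

The cost of this style is one return plus one dispatch per jump, which is the price paid for not relying on tail calls.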
+
+Cmm control flow mainly involves two forms of branching: in-function and cross-function. Each function consists of a map from `hoopl` labels to basic blocks and an entry label. Branching happens at the end of each basic block.
+
+In-function branching is relatively easy to handle. `binaryen` provides a "relooper" that can recover WebAssembly instructions with structured control flow from a control-flow graph. For each `CmmGraph`, we invoke the relooper to handle branching between basic blocks.
+
+Cross-function branching (`CmmCall`) is trickier. WebAssembly lacks explicit tail calls, and the relooper can't easily be used here because the call is effectively a computed goto whose potential targets include all Cmm blocks involved in linking. There are several possible ways to handle this:
+
+* Collect all Cmm blocks into one function and add a "dispatcher" block. Every `CmmCall` saves the callee to a register and branches to the dispatcher, which uses `br_table` or a binary decision tree to branch to the callee's entry block.
+* Generate one WebAssembly function per `CmmProc`; upon `CmmCall`, the function returns the function id of the callee. A mini-interpreter function at the top level repeatedly invokes the functions using `call_indirect`. This is the approach used by the unregisterised mode of `ghc`.
+
+We're still investigating which approach is best. The first approach probably produces the fastest code, at the cost of no dynamic linking (not a scheduled feature anyway) and a potential slowdown when linking large Haskell programs (unless an O(n) relooping algorithm is implemented).
+
+### Relocations
+
+When producing a WebAssembly binary, we need to map `CLabel`s to the precise linear memory locations for `CmmStatics` or the precise table ids for `CmmProc`s. These are unknown when compiling individual modules, so `binaryen` is invoked only at link time; during compilation we only convert `CLabel`s to a serializable representation.
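The link-time resolution described above can be sketched as follows. This is an illustrative Python model, not asterius or `binaryen` code: the `Module` layout, the `link` function, the symbol names, the 8-byte alignment and the base address of 1024 are all assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Module:
    """What a compiled module might carry before linking (hypothetical layout)."""
    statics: Dict[str, bytes]  # symbol -> static data bytes
    procs: List[str]           # symbols of the functions defined in this module


def align_up(n: int, alignment: int) -> int:
    """Round n up to the next multiple of alignment."""
    return (n + alignment - 1) // alignment * alignment


def link(modules: List[Module], base: int = 1024) -> Tuple[Dict[str, int], Dict[str, int]]:
    """Assign each static a linear-memory address and each function a table id."""
    addresses: Dict[str, int] = {}
    table_ids: Dict[str, int] = {}
    cursor = base
    for module in modules:
        for symbol, payload in module.statics.items():
            addresses[symbol] = cursor
            cursor = align_up(cursor + len(payload), 8)
        for symbol in module.procs:
            table_ids[symbol] = len(table_ids)
    # wasm32 addresses must fit in 32 bits, even if the compiler itself is 64-bit.
    assert cursor <= 0xFFFF_FFFF, "static data exceeds the wasm32 address space"
    return addresses, table_ids


m1 = Module(statics={"Main_foo_closure": b"\x00" * 12}, procs=["Main_foo_entry"])
m2 = Module(statics={"Main_bar_closure": b"\x00" * 8}, procs=["Main_bar_entry"])
addresses, table_ids = link([m1, m2])
print(addresses)
print(table_ids)
```

Because addresses and table ids depend on every module taking part in the link, nothing in a single module's output can hard-code them, which is why symbols stay symbolic until this step.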
+
+It's also worth noting that only `wasm32` is currently implemented, while we are running 64-bit `ghc`, so extra care needs to be taken when computing memory locations.
diff --git a/mkdocs.yml b/mkdocs.yml
index a7b80962..a0fda3b3 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -2,8 +2,11 @@ site_name: Asterius
 site_url: https://tweag.github.io/asterius
 repo_url: https://github.com/tweag/asterius
 pages:
-  - 'Introduction': 'index.md'
-  - 'Building guide': 'building.md'
-  - 'Project architecture': 'architecture.md'
-  - 'Project roadmap': 'roadmap.md'
+  - 'Home': 'index.md'
+  - For users:
+    - 'Building guide': 'building.md'
+  - For developers:
+    - 'Project architecture': 'architecture.md'
+    - 'About WebAssembly': 'webassembly.md'
+    - 'Project roadmap': 'roadmap.md'
 theme: readthedocs