This commit updates the `walrus` crate used in `wasm-bindgen`. The major
change here is how `walrus` handles element segments, exposing segments
rather than trying to keep a contiugous array of all the elements and
doing the splitting itself. That means that we need to do mroe logic
here in `wasm-bindgen` to juggle indices, segments, etc.
This commit adds a test suite for consuming interface types modules as
input and producing a JS polyfill output. The tests are relatively
simple today and don't exercise a ton of functionality, but they should
hopefully cover the breadth of at least some basics of what wasm
interface types supports today.
A few small fixes were applied along the way, such as:
* Don't require modules to have a stack pointer
* Allow passing `*.wat`, `*.wit`, or `*.wasm` files as input to
`wasm-bindgen` instead of always requiring `*.wasm`.
This commit is a pretty large scale rewrite of the internals of wasm-bindgen. No user-facing changes are expected as a result of this PR, but due to the scale of changes here it's likely inevitable that at least something will break. I'm hoping to get more testing in though before landing!
The purpose of this PR is to update wasm-bindgen to the current state of the interface types proposal. The wasm-bindgen tool was last updated when it was still called "WebIDL bindings" so it's been awhile! All support is now based on https://github.com/bytecodealliance/wasm-interface-types which defines parsers/binary format/writers/etc for wasm-interface types.
This is a pretty massive PR and unfortunately can't really be split up any more afaik. I don't really expect realistic review of all the code here (or commits), but some high-level changes are:
* Interface types now consists of a set of "adapter functions". The IR in wasm-bindgen is modeled the same way not.
* Each adapter function has a list of instructions, and these instructions work at a higher level than wasm itself, for example with strings.
* The wasm-bindgen tool has a suite of instructions which are specific to it and not present in the standard. (like before with webidl bindings)
* The anyref/multi-value transformations are now greatly simplified. They're simply "optimization passes" over adapter functions, removing instructions that are otherwise present. This way we don't have to juggle so much all over the place, and instructions always have the same meaning.
This commit switches all of `wasm-bindgen` from the `failure` crate to
`anyhow`. The `anyhow` crate should serve all the purposes that we
previously used `failure` for but has a few advantages:
* It's based on the standard `Error` trait rather than a custom `Fail`
trait, improving ecosystem compatibility.
* We don't need a `#[derive(Fail)]`, which means that's less code to
compile for `wasm-bindgen`. This notably helps the compile time of
`web-sys` itself.
* Using `Result<()>` in `fn main` with `anyhow::Error` produces
human-readable output, so we can use that natively.
This tiny crate provides utilities for working with Wasm codegen
conventions (typically established by LLVM or lld) such as getting the shadow
stack pointer.
It also de-duplicates all the places in the codebase where we were implementing
these conventions in one-off ways.
This crate provides a transformation to turn exported functions that use a
return pointer into exported functions that use multi-value.
Consider the following function:
```rust
pub extern "C" fn pair(a: u32, b: u32) -> [u32; 2] {
[a, b]
}
```
LLVM will by default compile this down into the following Wasm:
```wasm
(func $pair (param i32 i32 i32)
local.get 0
local.get 2
i32.store offset=4
local.get 0
local.get 1
i32.store)
```
What's happening here is that the function is not directly returning the
pair at all, but instead the first `i32` parameter is a pointer to some
scratch space, and the return value is written into the scratch space. LLVM
does this because it doesn't yet have support for multi-value Wasm, and so
it only knows how to return a single value at a time.
Ideally, with multi-value, what we would like instead is this:
```wasm
(func $pair (param i32 i32) (result i32 i32)
local.get 0
local.get 1)
```
However, that's not what this transformation does at the moment. This
transformation is a little simpler than mutating existing functions to
produce a multi-value result, instead it introduces new functions that wrap
the original function and translate the return pointer to multi-value
results in this wrapper function.
With our running example, we end up with this:
```wasm
;; The original function.
(func $pair (param i32 i32 i32)
local.get 0
local.get 2
i32.store offset=4
local.get 0
local.get 1
i32.store)
(func $pairWrapper (param i32 i32) (result i32 i32)
;; Our return pointer that points to the scratch space we are allocating
;; on the shadow stack for calling `$pair`.
(local i32)
;; Allocate space on the shadow stack for the result.
global.get $shadowStackPointer
i32.const 8
i32.sub
local.tee 2
global.set $shadowStackPointer
;; Call `$pair` with our allocated shadow stack space for its results.
local.get 2
local.get 0
local.get 1
call $pair
;; Copy the return values from the shadow stack to the wasm stack.
local.get 2
i32.load
local.get 2 offset=4
i32.load
;; Finally, restore the shadow stack pointer.
local.get 2
i32.const 8
i32.add
global.set $shadowStackPointer)
```
This `$pairWrapper` function is what we actually end up exporting instead of
`$pair`.
This commit updates `wasm-bindgen` to the latest version of `walrus`
which transforms all internal IR representations to a list-based IR
instead of a tree-based IR. This isn't a major change other than
cosmetic for `wasm-bindgen` itself, but involves a lot of changes to the
threads/anyref passes.
This commit also updates our CI configuration to actually run all the
anyref tests on CI. This is done by downloading a nightly build of
node.js which is theorized to continue to be there for awhile until the
full support makes its way into releases.
Ensure that we enable the new `parallel` feature in the CLI so our tools all use
parallelized parsing, but none of our specific crates need it for usage.
This commit updates the `walrus` dependency with recent upstream API
changes in `walrus` itself, namely updates to passive segements and how
memory data segments are handled
This commit is the second, and hopefully last massive, refactor for
using WebIDL bindings internally in `wasm-bindgen`. This commit actually
fully executes on the task at hand, moving `wasm-bindgen` to internally
using WebIDL bindings throughout its code generation, anyref passes,
etc. This actually fixes a number of issues that have existed in the
anyref pass for some time now!
The main changes here are to basically remove the usage of `Descriptor`
from generating JS bindings. Instead two new types are introduced:
`NonstandardIncoming` and `NonstandardOutgoing` which are bindings lists
used for incoming/outgoing bindings. These mirror the standard
terminology and literally have variants which are the standard values.
All `Descriptor` types are now mapped into lists of incoming/outgoing
bindings and used for process in wasm-bindgen. All JS generation has
been refactored and updated to now process these lists of bindings
instead of the previous `Descriptor`.
In other words this commit takes `js2rust.rs` and `rust2js.rs` and first
splits them in two. Interpretation of `Descriptor` and what to do for
conversions is in the binding selection modules. The actual generation
of JS from the binding selection is now performed by `incoming.rs` and
`outgoing.rs`. To boot this also deduplicates all the code between the
argument handling of `js2rust.rs` and return value handling of
`rust2js.rs`. This means that to implement a new binding you only need
to implement it one place and it's implemented for free in the other!
This commit is not the end of the story though. I would like to add a
mdoe to `wasm-bindgen` that literally emits a WebIDL bindings section.
That's left for a third (and hopefully final) refactoring which is also
intended to optimize generated JS for bindings.
This commit currently loses the optimization where an imported is hooked
up by value directly whenever a shim isn't needed. It's planned that
the next refactoring to emit a webidl binding section that can be added
back in. It shouldn't be too too hard hopefully since all the
scaffolding is in place now.
cc #1524
This commit implements [RFC 8], which enables transitive and transparent
dependencies on NPM. The `module` attribute, when seen and not part of a
local JS snippet, triggers detection of a `package.json` next to
`Cargo.toml`. If found it will cause the `wasm-bindgen` CLI tool to load
and parse the `package.json` within each crate and then create a merged
`package.json` at the end.
[RFC 8]: https://github.com/rustwasm/rfcs/pull/8
This commit adds experimental support to `wasm-bindgen` to emit and
leverage the `anyref` native wasm type. This native type is still in a
proposal status (the reference-types proposal). The intention of
`anyref` is to be able to directly hold JS values in wasm and pass the
to imported functions, namely to empower eventual host bindings (now
renamed WebIDL bindings) integration where we can skip JS shims
altogether for many imports.
This commit doesn't actually affect wasm-bindgen's behavior at all
as-is, but rather this support requires an opt-in env var to be
configured. Once the support is stable in browsers it's intended that
this will add a CLI switch for turning on this support, eventually
defaulting it to `true` in the far future.
The basic strategy here is to take the `stack` and `slab` globals in the
generated JS glue and move them into wasm using a table. This new table
in wasm is managed at the fringes via injected shims. At
`wasm-bindgen`-time the CLI will rewrite exports and imports with shims
that actually use `anyref` if needed, performing loads/stores inside the
wasm module instead of externally in the wasm module.
This should provide a boost over what we have today, but it's not a
fantastic strategy long term. We have a more grand vision for `anyref`
being a first-class type in the language, but that's on a much longer
horizon and this is currently thought to be the best we can do in terms
of integration in the near future.
The stack/heap JS tables are combined into one wasm table. The stack
starts at the end of the table and grows down with a stack pointer (also
injected). The heap starts at the end and grows up (state managed in
linear memory). The anyref transformation here will hook up various
intrinsics in wasm-bindgen to the runtime functionality if the anyref
supoprt is enabled.
The main tricky treatment here was applied to closures, where we need JS
to use a different function pointer than the one Rust gives it to use a
JS function pointer empowered with anyref. This works by switching up a
bit how descriptors work, embedding the shims to call inside descriptors
rather than communicated at runtime. This means that we're accessing
constant values in the generated JS and we can just update the constant
value accessed.