This commit starts the `wasm-bindgen` CLI tool down the road to being a
true polyfill for WebIDL bindings. This refactor is probably the first
of a few, but is hopefully the largest and most sprawling and everything
will be a bit more targeted from here on out.
The goal of this refactoring is to separate out the massive
`crates/cli-support/src/js/mod.rs` into a number of separate pieces of
functionality. It currently takes care of basically everything
including:
* Binding intrinsics
* Handling anyref transformations
* Generating all JS for imports/exports
* All the logic for how to import and how to name imports
* Execution and management of wasm-bindgen closures
Many of these are separable concerns and most overlap with WebIDL
bindings. The internal refactoring here is intended to make it more
clear who's responsible for what as well as making some existing
operations much more straightforward. At a high-level, the following
changes are done:
1. A `src/webidl.rs` module is introduced. The purpose of this module is
to take all of the raw wasm-bindgen custom sections from the module
and transform them into a WebIDL bindings section.
This module has a placeholder `WebidlCustomSection` which is nowhere
near the actual custom section but if you squint is in theory very
similar. It's hoped that this will eventually become the true WebIDL
custom section, currently being developed in an external crate.
Currently, however, the WebIDL bindings custom section only covers a
subset of the functionality we export to wasm-bindgen users. To avoid
leaving them high and dry this module also contains an auxiliary
custom section named `WasmBindgenAux`. This custom section isn't
intended to have a binary format, but is intended to represent a
theoretical custom section necessary to couple with WebIDL bindings to
achieve all our desired functionality in `wasm-bindgen`. It'll never
be standardized, but it'll also never be serialized :)
2. The `src/webidl.rs` module now takes over quite a bit of
functionality from `src/js/mod.rs`. Namely it handles synthesis of an
`export_map` and an `import_map` mapping export/import IDs to exactly
what's expected to be hooked up there. This does not include type
information (as that's in the bindings section) but rather includes
things like "this is the method of class A" or "this import is from
module `foo`" and things like that. These could arguably be subsumed
by future JS features as well, but that's for another time!
3. All handling of wasm-bindgen "descriptor functions" now happens in a
dedicated `src/descriptors.rs` module. The output of this module is
its own custom section (intended to be immediately consumed by the
WebIDL module) which is in theory what we want to ourselves emit one
day but rustc isn't capable of doing so right now.
4. Invocations and generations of imports are completely overhauled.
Using the `import_map` generated in the WebIDL step all imports are
now handled much more precisely in one location rather than
haphazardly throughout the module. This means we have precise
information about each import of the module and we only modify
exactly what we're looking at. This also vastly simplifies intrinsic
generation since it's all simply a codegen part of the `rust2js.rs`
module now.
5. Handling of direct imports which don't have a JS shim generated is
slightly different from before and is intended to be
future-compatible with WebIDL bindings in its full glory, but we'll
need to update it to handle cases for constructors and method calls
eventually as well.
6. Intrinsic definitions now live in their own file (`src/intrinsic.rs`)
and have a separated definition for their symbol name and signature.
The actual implementation of each intrinsic lives in `rust2js.rs`
There's a number of TODO items to finish before this merges. This
includes reimplementing the anyref pass and actually implementing import
maps for other targets. Those will come soon in follow-up commits, but
the entire `tests/wasm/main.rs` suite is currently passing and this
seems like a good checkpoint.
LLVM's mergefunc pass may mean that the same descriptor function is used
for different closure invocation sites even when the closure itself is
different. This typically only happens with LTO but in theory could
happen at any time!
The assert was tripping when we tried to delete the same function table
entry twice, so instead of a `Vec<usize>` of entries to delete this
commit switches to a `HashSet<usize>` which should do the deduplication
for us and enusre that we delete each descriptor only once.
Closes#1264
This commit moves `wasm-bindgen` the CLI tool from internally using
`parity-wasm` for wasm parsing/serialization to instead use `walrus`.
The `walrus` crate is something we've been working on recently with an
aim to replace the usage of `parity-wasm` in `wasm-bindgen` to make the
current CLI tool more maintainable as well as more future-proof.
The `walrus` crate provides a much nicer AST to work with as well as a
structured `Module`, whereas `parity-wasm` provides a very raw interface
to the wasm module which isn't really appropriate for our use case. The
many transformations and tweaks that wasm-bindgen does have a huge
amount of ad-hoc index management to carefully craft a final wasm
binary, but this is all entirely taken care for us with the `walrus`
crate.
Additionally, `wasm-bindgen` will ingest and rewrite the wasm file,
often changing the binary offsets of functions. Eventually with DWARF
debug information we'll need to be sure to preserve the debug
information throughout the transformations that `wasm-bindgen` does
today. This is practically impossible to do with the `parity-wasm`
architecture, but `walrus` was designed from the get-go to solve this
problem transparently in the `walrus` crate itself. (it doesn't today,
but this is planned work)
It is the intention that this does not end up regressing any
`wasm-bindgen` use cases, neither in functionality or in speed. As a
large change and refactoring, however, it's likely that at least
something will arise! We'll want to continue to remain vigilant to any
issues that come up with this commit.
Note that the `gc` crate has been deleted as part of this change, as the
`gc` crate is no longer necessary since `walrus` does it automatically.
Additionally the `gc` crate was one of the main problems with preserving
debug information as it often deletes wasm items!
Finally, this also starts moving crates to the 2018 edition where
necessary since `walrus` requires the 2018 edition, and in general it's
more pleasant to work within the 2018 edition!
Currently closure shims are communicated to JS at runtime, although at
runtime the same constant value is always passed to JS! More pressing,
however, work in #1002 requires knowledge of closure descriptor indices
at `wasm-bindgen` time which is not currently known.
Since the closure descriptor shims and such are already constant values,
this commit moves the descriptor function indices into the *descriptor*
for a closure/function pointer. This way we can learn about these values
at `wasm-bindgen` time instead of only knowing them at runtime.
This should have no semantic change on users of `wasm-bindgen`, although
some closure invocations may be slightly speedier because there's less
arguments being transferred over the boundary. Overall though this will
help #1002 as the closure shims that the Rust compiler generates may not
be the exact ones we hand out to JS, but rather wrappers around them
which do `anyref` business things.
This commit improves the codegen for `Closure<T>`, primarily for ZST
where the closure doesn't actually capture anything. Previously
`wasm-bindgen` would unconditionally allocate an `Rc` for a fat pointer,
meaning that it would always hit the allocator even when the `Box<T>`
didn't actually contain an allocation. Now the reference count for the
closure is stored on the JS object rather than in Rust.
Some more advanced tests were added along the way to ensure that
functionality didn't regress, and otherwise the calling convention for
`Closure` changed a good deal but should still be the same user-facing.
The primary change was that the reference count reaching zero may cause
JS to need to run the destructor. It simply returns this information in
`Drop for Closure` and otherwise when calling it now also retains a
function pointer that runs the destructor.
Closes#874
Previously `wasm-bindgen` would take its `breaks_if_inlined` shims and
attempt to remove them entirely, replacing calls to `breaks_if_inlined`
to the imported closure factories. This worked great in that it would
remove the `breaks_if_inlined` funtion entirely, removing the "cost" of
the `#[inline(never)]`.
Unfortunately as #864 discovered this is "too clever by half". LLVM's
aggressive optimizations won't inline `breaks_if_inlined`, but it may
still change the ABI! We can't replace calls to `breaks_if_inlined` if
the signature changes, because the function its calling has a fixed signature.
This commit cops out a bit and instead of replacing calls to
`breaks_if_inlined` to the imported closure factories, we instead
rewrite calls to `__wbindgen_describe_closure` to the closure factories.
This means that the `breaks_if_inlined` shims do not get removed. It
also means that the closure factory shims have a third and final
argument (what would be the function pointer of the descriptor function)
which is dead and unused.
This should be a functional solution for now and let us iterate on a
true fix later on (if needed). For now the cost of this
`#[inline(never)]` and the extra unused argument should be quite small.
Closes#864
This commit adds an implementation of `AsRef<JsValue>` for the `Closure<T>`
type. Previously this was not possible because the `JsValue` didn't actually
exist until the closure was passed to JS, but the implementation has been
changed to ... something a bit more unconventional. The end result, however, is
that `Closure<T>` now always contains a `JsValue`.
The end result of this work is intended to be a precursor to binding callbacks
in `web-sys` as `JsValue` everywhere but still allowing usage with `Closure<T>`.