zed/crates/plugin_runtime/OPAQUE.md
Marshall Bowers aff119b80a
2024-01-10 10:09:48 -05:00


Opaque handles to resources

Currently, Zed's plugin system only supports moving data (e.g. things you can serialize) across the boundary between guest-side plugin and host-side runtime. Resources, things you can't just copy, have been set aside for now. Given how important this is to Zed, I think it's about time we address this.

Managing resources is very important to Zed, because a lot of what Zed does is exactly that—managing resources. Each open buffer you're editing is a resource, as is the language server you're querying, or the collaboration session you're currently in. Therefore, writing a plugin system with deep integration with Zed requires some mechanism to manage resources.

Resources are problematic because, unlike data, we can't pass them across the ABI boundary. Wasm can't take references to host memory (and even if it could, that doesn't mean that it's a good idea). To add support for resources to plugins, we'd need three things:

  1. Some sort of way for the host-side runtime to hang onto references to a resource. If the plugin requests to modify a resource, but we don't even know where that resource is, that's kinda bad, isn't it?

  2. Some sort of way for the guest-side runtime to hang onto handles to a resource. We can't reference the resource directly from a plugin, but if a resource has been registered with the runtime, we can at least take a runtime-provided handle to that resource so that we may request that the runtime modify it in the future.

  3. Some sort of way to modify the resources we're holding onto. This requires two things: some way for a plugin to request a modification, and some way for the runtime to apply that modification. Here I'm using 'modification' in the most general sense, which includes, e.g., reading or writing to the resource, i.e. calling a method on it.

Luckily for us, managing resources across boundaries is a problem that languages have had to deal with for eons. File descriptors, which reference resources managed by the kernel, are the quintessential example of resource management, but this pattern is oft repeated in games, scripting languages, or, surprise surprise, when writing plugins.
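To make the file-descriptor analogy concrete, here's a minimal sketch of a handle table that covers the three requirements listed above. The names `Handle` and `HandleTable` are made up for illustration; nothing like this exists in the plugin runtime yet.

```rust
use std::collections::HashMap;

/// An opaque handle, analogous to a file descriptor: just a number.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct Handle(u32);

/// Host-side table that owns the real resources (hypothetical name).
pub struct HandleTable<T> {
    next: u32,
    slots: HashMap<u32, T>,
}

impl<T> HandleTable<T> {
    pub fn new() -> Self {
        Self { next: 0, slots: HashMap::new() }
    }

    /// Register a resource and hand back a handle to it (requirements 1 and 2).
    pub fn open(&mut self, resource: T) -> Handle {
        let id = self.next;
        self.next += 1;
        self.slots.insert(id, resource);
        Handle(id)
    }

    /// Apply a modification to the resource behind a handle (requirement 3).
    /// Returns `None` if the handle is stale or was never issued.
    pub fn with<R>(&mut self, handle: Handle, f: impl FnOnce(&mut T) -> R) -> Option<R> {
        self.slots.get_mut(&handle.0).map(f)
    }

    /// Release the resource, much like closing a file descriptor.
    pub fn close(&mut self, handle: Handle) -> Option<T> {
        self.slots.remove(&handle.0)
    }
}
```

The holder of a `Handle` never touches the resource directly; every operation goes through the table, which is the property we want at the Wasm boundary.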

To see what managing resources in plugins could look like in Rust, we need look no further than Rhai. Rhai is a scripting language powered by a tree-walk interpreter written in Rust. It's pretty neat, but what we care about is not the language itself, but how it interfaces with Rust types.

In its guide, Rhai claims the following:

Rhai works seamlessly with any Rust type, as long as it implements Clone as this allows the Engine to pass by value.

This doesn't mean that the underlying resources themselves need to be copied:

Because Rhai works with types implementing `Clone`

Given that we have to register a resource with our plugin runtime before we use it, requiring the resource to be behind a shared reference makes sense, so I think the Clone bound is reasonable. So how does Rhai represent types under the hood?
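As a quick sanity check on why the Clone bound is cheap when the resource sits behind a shared reference, here's a small std-only sketch. `Buffer` is a hypothetical stand-in for a real resource and deliberately does not implement `Clone` itself.

```rust
use std::sync::{Arc, Mutex};

/// Stand-in for a resource that must not be copied (hypothetical type).
pub struct Buffer {
    pub text: String,
}

/// Cloning the `Arc` copies a pointer and bumps a reference count;
/// the `Buffer` itself is never duplicated.
pub fn share(resource: &Arc<Mutex<Buffer>>) -> Arc<Mutex<Buffer>> {
    Arc::clone(resource)
}
```

Both the original and the clone point at the same allocation, so a write through one is visible through the other.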

A custom type is stored in Rhai as a Rust trait object (specifically, a dyn rhai::Variant), with no restrictions other than being Clone (plus Send + Sync under the sync feature).

I'd be interested to know how Rhai disambiguates between different types if everything's a trait object under the hood.

Rhai actually exposes a pretty nice interface for working with native Rust types. We can register a type using Engine::register_type::<T: Variant + Clone>(). Internally, this just grabs the string name of the type for future reference.

Note: Rhai uses strings, but I wonder if you could get away with something more compact using TypeIds. Maybe not, given that TypeIds are not deterministic across builds, and we'd need matching IDs both host-side and guest-side.

In Rhai, we can alternatively use the method Engine::register_type_with_name::<T: Variant + Clone>(name: &str) if we have a different type name host-side (in Rust) and guest-side (in Rhai).
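To get a feel for name-based registration, here's a rough std-only sketch of a registry keyed by `std::any::type_name`. This is my guess at the general shape, not Rhai's actual implementation; `TypeRegistry`, `guest_name`, and `Rope` are hypothetical.

```rust
use std::any::type_name;
use std::collections::HashMap;

/// Placeholder resource type for the example.
pub struct Rope;

/// Maps a Rust type's full name to the name the guest sees (hypothetical).
#[derive(Default)]
pub struct TypeRegistry {
    names: HashMap<&'static str, String>,
}

impl TypeRegistry {
    /// Register `T` under its own Rust type name.
    pub fn register_type<T: 'static>(&mut self) {
        self.names.insert(type_name::<T>(), type_name::<T>().to_string());
    }

    /// Register `T` under a separate guest-facing name.
    pub fn register_type_with_name<T: 'static>(&mut self, name: &str) {
        self.names.insert(type_name::<T>(), name.to_string());
    }

    /// Look up the guest-facing name for `T`, if it was registered.
    pub fn guest_name<T: 'static>(&self) -> Option<&str> {
        self.names.get(type_name::<T>()).map(String::as_str)
    }
}
```

Keeping the guest-facing name separate from the Rust name is what would let a non-Rust plugin refer to the same type.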

With respect to Wasm plugins, I think an interface like this is fairly important, because we don't know whether the original plugin was written in Rust. (This may not be true now, because we write all the plugins Zed uses, but once we allow packaging and shipping plugins, it's important to maintain a consistent interface, because even Rust changes over time.)

Once we've registered a type, we can begin using this type in functions. We can add a new function using the standard Engine::register_fn function, which has the following signature:

pub fn register_fn<N, A, F>(&mut self, name: N, func: F) -> &mut Self
where
    N: AsRef<str> + Into<Identifier>,
    F: RegisterNativeFunction<A, ()>,

This is quite complex, but under the hood it's fairly similar to our own PluginBuilder::host_function async method. Looking at RegisterNativeFunction, it seems as though this trait essentially provides methods that expose the TypeIDs and type/param names of the arguments and return types of the function.
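Here's a small sketch of how a trait can expose the argument and return types of a plain Rust function without calling it. `DescribeFn` is a hypothetical trait I made up to illustrate the idea; it is not Rhai's RegisterNativeFunction, and it only covers one- and two-argument functions.

```rust
use std::any::{type_name, TypeId};

/// Hypothetical metadata trait. `Args` is a marker tuple that keeps the
/// impls for different arities from overlapping.
pub trait DescribeFn<Args> {
    fn param_type_ids() -> Vec<TypeId>;
    fn param_names() -> Vec<&'static str>;
    fn return_type_id() -> TypeId;
}

impl<F, A: 'static, R: 'static> DescribeFn<(A,)> for F
where
    F: Fn(A) -> R,
{
    fn param_type_ids() -> Vec<TypeId> { vec![TypeId::of::<A>()] }
    fn param_names() -> Vec<&'static str> { vec![type_name::<A>()] }
    fn return_type_id() -> TypeId { TypeId::of::<R>() }
}

impl<F, A: 'static, B: 'static, R: 'static> DescribeFn<(A, B)> for F
where
    F: Fn(A, B) -> R,
{
    fn param_type_ids() -> Vec<TypeId> { vec![TypeId::of::<A>(), TypeId::of::<B>()] }
    fn param_names() -> Vec<&'static str> { vec![type_name::<A>(), type_name::<B>()] }
    fn return_type_id() -> TypeId { TypeId::of::<R>() }
}

/// Inspect a function value's parameters without calling it.
pub fn param_type_ids<Args, F: DescribeFn<Args>>(_func: &F) -> Vec<TypeId> {
    F::param_type_ids()
}

/// Inspect a function value's return type without calling it.
pub fn return_type_id<Args, F: DescribeFn<Args>>(_func: &F) -> TypeId {
    F::return_type_id()
}

/// Example function to inspect.
pub fn append_len(text: String, extra: u32) -> usize {
    text.len() + extra as usize
}
```

With this kind of metadata in hand, a runtime can check at registration time which arguments are plain data and which are resources.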

So once we register a function, what happens when we call it? Well, let me introduce you to my friend Engine::call_native_fn, whose type signature is too complex to list here.

Note: Finding this function took like 7 levels of indirection from eval. It's surprising how much shuffling of data Rhai does under the hood; I bet you could probably make it a lot faster.

This takes and returns, like everything else in Rhai, an object of type Dynamic. We know that we can use native Rust types, so how does Rhai perform the conversion to and from Dynamic?

The secret lies in Dynamic::try_cast::<T: Any>(self) -> Option<T>. Like most dynamic scripting languages, Rhai uses a tagged Union to represent types. Remember dyn Variant from earlier? Rhai's Union has a variant, Variant, to hold the dynamic native types:

/// Any type as a trait object.
#[allow(clippy::redundant_allocation)]
Variant(Box<Box<dyn Variant>>, Tag, AccessMode),

Redundant allocations aside, to try_cast a Dynamic to T: Any, we pattern match on Union. In the case of Variant, we do the following:

Union::Variant(v, ..) => (*v).as_boxed_any().downcast().ok().map(|x| *x),
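The same downcast trick works with nothing but `std::any::Any`. Below is a stripped-down sketch of a tagged union with a catch-all trait-object variant. It borrows the names `Dynamic` and `try_cast` for readability, but it is not Rhai's real type, which has many more variants and extra bookkeeping; `from_native` and `Rope` are hypothetical.

```rust
use std::any::Any;

/// Placeholder native type for the example.
pub struct Rope(pub String);

/// A tiny tagged union loosely modeled on the description above.
pub enum Dynamic {
    Int(i64),
    Str(String),
    /// Any other native type, stored as a trait object.
    Variant(Box<dyn Any>),
}

impl Dynamic {
    /// Wrap an arbitrary native value.
    pub fn from_native<T: Any>(value: T) -> Self {
        Dynamic::Variant(Box::new(value))
    }

    /// Try to recover a concrete `T`; `None` if the stored type differs.
    pub fn try_cast<T: Any>(self) -> Option<T> {
        let boxed: Box<dyn Any> = match self {
            Dynamic::Int(i) => Box::new(i),
            Dynamic::Str(s) => Box::new(s),
            Dynamic::Variant(v) => v,
        };
        // `downcast` compares `TypeId`s at runtime and hands the box back on mismatch.
        boxed.downcast::<T>().ok().map(|b| *b)
    }
}
```

The important part is that a mismatched cast fails safely with `None` instead of reinterpreting memory.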

Now Rhai can do this because it's implemented in Rust. In other words, unlike Wasm, Rhai scripts can, indirectly, hold references to places in host memory. For us to implement something like this for Wasm plugins, we'd have to keep track of a "ResourcePool" (alive for the duration of each function call) that we can check Rust types into and out of.

I think I've got a handle on how Rhai works now, so let's stop talking about Rhai and discuss what this opaque object system would look like if we implemented it in Rust.

Design Sketch

First things first, we'd have to generalize the arguments we can pass to and return from functions host-side. Currently, we support anything that's serdeable. We'd have to create a new trait, say Value, that has blanket implementations for both serde and Clone (or something like this; if a type is both serde and Clone, we'd have to figure out a way to disambiguate).

We'd also create a ResourcePool struct that essentially is a Vec of Box<dyn Any>. When calling a function, all Value arguments that are resources (i.e. Clone instead of serde) would be cast to dyn Any and stored in the ResourcePool.

We'd probably also need a Resource trait that defines an associated handle for a resource. Something like this:

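use serde::{de::DeserializeOwned, Serialize};

pub trait Resource {
    /// The serializable stand-in for this resource that crosses the ABI boundary.
    type Handle: Serialize + DeserializeOwned;
    /// Build a handle from a slot index in the resource pool.
    fn handle(index: u32) -> Self::Handle;
    /// Recover the slot index from a handle.
    fn index(handle: Self::Handle) -> u32;
}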

Where a handle is just a dead-simple wrapper around a u32:

#[derive(Serialize, Deserialize)]
pub struct CoolHandle(u32);

It's important that this handle be accessible both host-side and plugin-side. I don't know if this means that we have another crate, like plugin_handles, that contains a bunch of u32 wrappers, or something else. Because a Resource::Handle is just a u32, it's trivially serde, and can cross the ABI boundary.
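To show just how little there is to send, here's a std-only sketch that encodes a handle by hand as four little-endian bytes. In the actual design serde would do this for us; `to_bytes` and `from_bytes` are hypothetical helpers, and the byte order is an arbitrary choice for the example.

```rust
use std::convert::TryFrom;

/// Hypothetical handle type; the design above derives serde traits instead.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct CoolHandle(u32);

impl CoolHandle {
    pub fn new(index: u32) -> Self {
        CoolHandle(index)
    }

    /// Encode the handle as four little-endian bytes for the trip across the boundary.
    pub fn to_bytes(self) -> [u8; 4] {
        self.0.to_le_bytes()
    }

    /// Decode a handle; `None` if the buffer is not exactly four bytes long.
    pub fn from_bytes(bytes: &[u8]) -> Option<Self> {
        let array = <[u8; 4]>::try_from(bytes).ok()?;
        Some(CoolHandle(u32::from_le_bytes(array)))
    }
}
```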

So when we add each T: Resource to the ResourcePool, the resource pool typecasts it to Any, appends it to the Vec, and returns the associated Resource::Handle. This handle is what we pass through to Wasm.
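A rough std-only sketch of that check-in and check-out logic might look like the following. To keep it free of external crates I've dropped the serde bounds on `Handle`, made `index` take the handle by reference, used `std::sync::Mutex` in place of an async `RwLock`, and let a `String` stand in for `Rope`. All of those are simplifications of mine, and `insert`, `get`, and `clear` are hypothetical method names.

```rust
use std::any::Any;
use std::sync::{Arc, Mutex};

/// Std-only variant of the `Resource` trait (serde bounds omitted).
pub trait Resource: Any + Clone {
    type Handle;
    fn handle(index: u32) -> Self::Handle;
    fn index(handle: &Self::Handle) -> u32;
}

#[derive(Default)]
pub struct ResourcePool {
    slots: Vec<Box<dyn Any>>,
}

impl ResourcePool {
    /// Check a resource in: erase its type, append it, return its handle.
    pub fn insert<T: Resource>(&mut self, resource: T) -> T::Handle {
        let index = self.slots.len() as u32;
        self.slots.push(Box::new(resource));
        T::handle(index)
    }

    /// Check a resource out. `None` if the index is out of range or the
    /// slot holds a different type than `T`.
    pub fn get<T: Resource>(&self, handle: &T::Handle) -> Option<T> {
        let slot = self.slots.get(T::index(handle) as usize)?;
        slot.downcast_ref::<T>().cloned()
    }

    /// Drop every resource the pool is holding onto.
    pub fn clear(&mut self) {
        self.slots.clear();
    }
}

/// A `String` behind a mutex plays the role of `Arc<RwLock<Rope>>`.
pub type SharedRope = Arc<Mutex<String>>;
pub struct RopeHandle(pub u32);

impl Resource for SharedRope {
    type Handle = RopeHandle;
    fn handle(index: u32) -> RopeHandle { RopeHandle(index) }
    fn index(handle: &RopeHandle) -> u32 { handle.0 }
}
```

Because the pool stores a clone of the `Arc`, checking a resource out yields a pointer to the very same rope the caller passed in.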

// Implementations and attributes omitted
pub struct Rope { ... }
pub struct RopeHandle(u32);
impl Resource for Arc<RwLock<Rope>> { ... }

let builder: PluginBuilder = ...;
let builder = builder
   .host_fn_async(
       "append",
       |(rope, string): (Arc<RwLock<Rope>>, &str)| async move {
           rope.write().await.append(Rope::from(string))
       }
   )
   // ...

Here we're providing a host function, append, that can be called from Wasm. To import this function into a plugin, we'd do something like the following:

use plugin::prelude::*;
use plugin_handles::RopeHandle;

#[import]
pub fn append(rope: RopeHandle, string: &str);

This allows us to perform an operation on a Rope, but how do we get a RopeHandle into a plugin? Well, as plugins, we can only access resources through handles we're given, so we'd need to expose a function that takes a handle.

To illustrate that point, here's an example. First, we'd define a plugin-side function as follows:

// same file as above ...

#[export]
pub fn append_newline(rope: RopeHandle) {
    append(rope, "\n");
}

Host-side, we'd treat this function like any other:

pub struct NewlineAppenderPlugin {
    append_newline: WasiFn<Arc<RwLock<Rope>>, ()>,
    runtime: Arc<Mutex<Plugin>>,
}

To call this function, we'd do the following:

let plugin: NewlineAppenderPlugin = ...;
let rope = Arc::new(RwLock::new(Rope::from("Hello World")));

// The `Plugin` lives behind the `runtime` mutex, so lock that field first.
plugin.runtime.lock().await.call(
    &plugin.append_newline,
    rope.clone(),
).await?;

// `rope` is now "Hello World\n"

So here's what calling append_newline would do, from the top:

  1. First, we'd create a new ResourcePool, and insert the Arc<RwLock<Rope>>, creating a RopeHandle in the process. (We could also reuse a resource pool across calls, but the idea is that the pool only keeps track of resources for the duration of the call).

  2. Then, we'd call the Wasm plugin function append_newline, passing in the RopeHandle we created, which easily crosses the ABI boundary.

  3. Next, in Wasm, we call the native imported function append. This sends the RopeHandle back over the boundary, to Rust.

  4. Looking in the Plugin's ResourcePool, we'd convert the handle into an index, grab and downcast the dyn Any back into the type we need, and then call the async Rust callback with an Arc<RwLock<Rope>>.

  5. The Rust async callback actually acquires a lock and appends the newline.

  6. And from here on out we return up the callstack, through Wasm, to Rust all the way back to where we started. Right before we return, we clear out the ResourcePool, so that we're no longer holding onto the underlying resource.

Throughout this entire chain of calls, the resource remains host-side. By temporarily checking it into a ResourcePool, we're able to keep a reference to the resource that we can use, while avoiding copying the uncopyable resource.
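The six steps above can be simulated end to end without any Wasm at all. In this sketch the "guest" is an ordinary Rust function that only ever sees a `RopeHandle`. A `String` behind `std::sync::Mutex` stands in for `Arc<RwLock<Rope>>`, and `Host`, `check_in`, `call_append_newline`, and `guest_append_newline` are hypothetical names, not part of the plugin runtime.

```rust
use std::any::Any;
use std::sync::{Arc, Mutex};

/// Stand-in for `Arc<RwLock<Rope>>`; a `String` plays the role of the rope.
pub type SharedRope = Arc<Mutex<String>>;

/// The only thing the "guest" ever sees.
#[derive(Clone, Copy)]
pub struct RopeHandle(pub u32);

/// Host-side state that lives for the duration of one plugin call.
#[derive(Default)]
pub struct Host {
    pool: Vec<Box<dyn Any>>,
}

impl Host {
    /// Step 1: check the resource in and mint a handle.
    pub fn check_in(&mut self, rope: SharedRope) -> RopeHandle {
        self.pool.push(Box::new(rope));
        RopeHandle((self.pool.len() - 1) as u32)
    }

    /// Steps 4 and 5: the host function behind the guest's `append` import.
    /// Returns `None` for an unknown handle or a slot of the wrong type.
    pub fn append(&mut self, handle: RopeHandle, text: &str) -> Option<()> {
        let rope = self.pool.get(handle.0 as usize)?.downcast_ref::<SharedRope>()?;
        rope.lock().ok()?.push_str(text);
        Some(())
    }

    /// Steps 1, 2 and 6: wrap a guest call with check-in and cleanup.
    pub fn call_append_newline(&mut self, rope: SharedRope) -> Option<()> {
        let handle = self.check_in(rope);
        let result = guest_append_newline(self, handle);
        self.pool.clear();
        result
    }
}

/// Steps 2 and 3: the "guest" function. In a real plugin this would be
/// Wasm, and `host.append` would be an imported function.
pub fn guest_append_newline(host: &mut Host, handle: RopeHandle) -> Option<()> {
    host.append(handle, "\n")
}
```

After the call returns, the pool is empty again, so the host holds no extra reference to the rope and a stale handle simply fails to resolve.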

Final Notes

Using this approach, it should be possible to add fairly good support for resources to Wasm. I've only done a little rough prototyping, so we're bound to run into some issues along the way, but I think this should be a good first approximation.

This next week, I'll try to get a production-ready version of this working, using the Language resource required by some Language Server Adapters.

Hope this guide made sense!