Description
trait SomeTrait {
  callback(Self) -> Unit
}

struct Value {}

impl SomeTrait for Value with callback(self) -> Unit {
  println("Value::callback")
}

fn example(value: &SomeTrait) -> Unit {
  value.callback()
}
(type $SomeTrait
  (sub
    (struct
      (field (ref $SomeTrait.method_0)))))

(type $Value.as_SomeTrait
  (sub $SomeTrait
    (struct
      (field (ref $SomeTrait.method_0))
      (field (ref $Value)))))
Today, when using trait polymorphism (like defining a parameter as &SomeTrait above), the compiler generates a new struct that's effectively a run-time vtable every time a struct implementing that trait is passed into such a function, like this:
(call $jayphelps/example/main.example
  (struct.new $Value.as_SomeTrait
    (ref.func $@jayphelps/example/main.SomeTrait::@jayphelps/example/main.Value::callback.dyncall_as_SomeTrait)
    (local.get $value/123)))

(; call it again, or any other function which relies on trait polymorphism ;)
(call $jayphelps/example/main.example
  (struct.new $Value.as_SomeTrait
    (ref.func $@jayphelps/example/main.SomeTrait::@jayphelps/example/main.Value::callback.dyncall_as_SomeTrait)
    (local.get $value/123)))
In this example, if you pass the struct value to multiple functions, a new intermediate vtable struct is created each time. It's somewhat contrived given there's only one method, but hopefully it demonstrates the point, because "at scale", in complex apps with lots of methods, this could add up.
Offhand I don't think there's a way around having this separate struct, since Wasm GC doesn't (yet) support subtyping multiple structs, but I'm curious whether this could be memory-optimized by placing all the functions in a statically defined Wasm (table), avoiding the overhead of the extra allocations that exist just to store the function references. That table's index is then what you put in $Value.as_SomeTrait.
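To make that concrete, here's a rough sketch of what I'm imagining. Identifiers like $trait_methods, $obj, and the shortened $Value.callback.dyncall_as_SomeTrait are made up for readability, not what the compiler emits today: the vtable slot becomes an i32 index into a single module-level table, and dispatch goes through call_indirect.

;; the trait-object structs store a table index instead of a funcref
(type $SomeTrait
  (sub
    (struct
      (field $method_0 i32))))
(type $Value.as_SomeTrait
  (sub $SomeTrait
    (struct
      (field $method_0 i32)
      (field (ref $Value)))))
;; function type of the trait method, taking the trait object as receiver
(type $SomeTrait.method_0 (func (param (ref $SomeTrait))))

;; every trait-method implementation lives in one statically initialized
;; table, so no funcref has to be written into each wrapper struct
(table $trait_methods funcref
  (elem $Value.callback.dyncall_as_SomeTrait))

;; dispatch then becomes a call_indirect using the stored index,
;; where $obj is a local holding the trait object
(call_indirect $trait_methods (type $SomeTrait.method_0)
  (local.get $obj)                                     ;; receiver argument
  (struct.get $SomeTrait $method_0 (local.get $obj)))  ;; table index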
That said, this might come at the cost of CPU performance, since you'd need to use call_indirect, which incurs bounds/type checks and a lookup. But I'm not sure whether call_ref is actually optimized to be faster than call_indirect in most VMs yet. One might argue memory is cheap these days and that it's better to err on the side of CPU perf. I guess you could also still define the vtable as a struct of ref.funcs, but as a global, which would sort of be a middle ground: faster calls and less redundant memory, but still some small overhead from the global lookup.
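Sketched out, that middle-ground version might look roughly like this, again with abbreviated, made-up names, and with $SomeTrait.method_0 standing for the method's function type as in the excerpt above: the funcref-carrying vtable struct is created once as a global, and each trait object just points at it.

;; shared, statically initialized vtable for Value's SomeTrait impl
(type $SomeTrait.vtable
  (struct
    (field (ref $SomeTrait.method_0))))
(type $SomeTrait
  (sub
    (struct
      (field (ref $SomeTrait.vtable)))))
(type $Value.as_SomeTrait
  (sub $SomeTrait
    (struct
      (field (ref $SomeTrait.vtable))
      (field (ref $Value)))))

(global $Value.vtable_for_SomeTrait (ref $SomeTrait.vtable)
  (struct.new $SomeTrait.vtable
    (ref.func $Value.callback.dyncall_as_SomeTrait)))

;; each call site still allocates the small wrapper struct, but it now
;; reuses the shared vtable instead of rebuilding the funcref slots
(call $example
  (struct.new $Value.as_SomeTrait
    (global.get $Value.vtable_for_SomeTrait)
    (local.get $value)))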
So I guess that mostly just leaves me curious what your thinking is around this. It's largely professional curiosity: you all have more compiler experience than I do, so I'd appreciate learning whether this was deliberate, and why. I noticed it because my code relies heavily on this feature, so I quickly saw a lot of redundant allocations when examining the build.