In Google Chrome M137, V8 introduced two powerful optimizations for WebAssembly: speculative call_indirect inlining and deoptimization support. Together, these techniques enable the compiler to generate faster machine code by making assumptions based on runtime feedback. This is especially beneficial for WebAssembly with garbage collection (WasmGC), which supports high-level managed languages like Java, Kotlin, and Dart. Early benchmarks show average speedups of over 50% on Dart microbenchmarks, and 1% to 8% on larger real-world applications. Deoptimization also lays the groundwork for future enhancements. Here we answer common questions about these optimizations.
What are speculative optimizations and why are they used in WebAssembly now?
Speculative optimizations involve making assumptions about program behavior based on past execution data. In JavaScript, this is common practice: a JIT compiler might assume a + b is an integer addition because earlier runs used integers. If the assumption later proves false, the engine performs a deoptimization (deopt) to revert to unoptimized code and collect new feedback.
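This feedback–speculate–deopt cycle can be sketched in plain JavaScript (a conceptual model only; `DeoptError`, `makeAdd`, and `speculativeCompile` are invented names, not V8 internals):

```javascript
// Conceptual model of speculative optimization driven by runtime feedback.
class DeoptError extends Error {}

function speculativeCompile(observedType) {
  // "Optimized" code: a cheap guard on the speculated type, then a fast path.
  return (a, b) => {
    if (typeof a !== observedType || typeof b !== observedType) {
      throw new DeoptError("type guard failed"); // assumption broken
    }
    return a + b; // a real compiler could emit a plain machine add here
  };
}

function makeAdd() {
  let feedback = null;  // type observed during baseline execution
  let optimized = null; // currently installed speculative version
  return (a, b) => {
    if (optimized) {
      try {
        return optimized(a, b);     // speculative fast path
      } catch (e) {
        if (!(e instanceof DeoptError)) throw e;
        optimized = null;           // deopt: discard the optimized code
      }
    }
    feedback = typeof a;            // baseline: collect fresh feedback
    optimized = speculativeCompile(feedback); // reoptimize with it
    return a + b;                   // generic slow path
  };
}

const add = makeAdd();
console.log(add(1, 2));     // 3    (baseline run, records "number")
console.log(add(3, 4));     // 7    (optimized fast path)
console.log(add("a", "b")); // "ab" (deopt, then baseline again)
```

Real engines do this at the machine-code level with compiled tiers rather than closures, but the lifecycle is the same: profile, speculate, guard, and fall back when the guard fails.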
Historically, WebAssembly (Wasm) didn't need speculation because its static typing and ahead-of-time compilation from C/C++/Rust already produced well-optimized binaries. However, the WasmGC proposal changed this. WasmGC introduces high-level types like structs, arrays, and subtyping, making WebAssembly more suitable for managed languages. These dynamic features benefit greatly from speculative optimizations, just as JavaScript's do. By using runtime feedback, V8 can now inline functions that are likely to be called, generating more efficient code that runs faster than generic implementations.
How does deoptimization support work for WebAssembly in V8?
Deoptimization is the mechanism that allows V8 to safely undo speculative optimizations when the runtime behavior violates the assumptions made during compilation. For WebAssembly, this is a new capability. Previously, Wasm code was always generated conservatively to handle all possible scenarios, which could be slower.
Now, with deopt support, V8's Liftoff baseline compiler collects runtime feedback (for example, which targets an indirect call actually jumps to), and the optimizing compiler records metadata describing where assumptions were made and how to reconstruct unoptimized execution state at those points. If an assumption breaks (e.g., a call site that always dispatched to one function suddenly dispatches to another), the engine discards the optimized code for that function and falls back to the slower but always-correct baseline variant. This process is transparent to the developer and happens at function granularity. The deoptimized code then collects new feedback, allowing V8 to reoptimize later with updated assumptions. This lifecycle mirrors how JavaScript optimization works in V8, making Wasm execution more adaptive and efficient when runtime patterns change.
What is speculative call_indirect inlining and how does it improve performance?
In WebAssembly, call_indirect is an instruction that calls a function through a table indirection, similar to virtual function calls in C++. Such indirect calls are inherently slower because the target must be looked up at runtime. Speculative call_indirect inlining is an optimization where, based on runtime profiling, V8 guesses the most frequent target of an indirect call and inlines that function directly into the caller.
This eliminates the indirection overhead and allows further optimizations like constant propagation and dead code elimination within the inlined function. If the guess turns out wrong later (i.e., a different function is called), V8 uses its deoptimization support to revert the inlined code and fall back to the original indirect call. This technique is particularly effective for WasmGC programs, where polymorphism is common (e.g., different implementations of an interface). Inlining the hot path in this way is where much of the 50%-plus microbenchmark speedup comes from.
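The transformation can be illustrated with a small JavaScript model (the table, the function bodies, and the choice of index 0 as the hot target are all invented for illustration; V8 performs this on machine code, not source):

```javascript
// A function table, analogous to the Wasm table used by call_indirect.
const table = [
  (x) => x * 2,    // index 0: the target observed most often at runtime
  (x) => x + 100,  // index 1: a rarely seen target
];

// Generic lowering: every call pays for the table lookup.
function callIndirectGeneric(index, x) {
  return table[index](x);
}

// Speculative lowering: profiling said index 0 dominates, so its body is
// inlined behind a cheap guard. Other indices take the ordinary indirect
// call (in V8, a guard failure can instead trigger a deopt to baseline).
function callIndirectSpeculative(index, x) {
  if (index === 0) {
    return x * 2;          // inlined body of table[0]: no lookup, and now
  }                        // eligible for constant folding into the caller
  return table[index](x);  // fallback path
}

console.log(callIndirectSpeculative(0, 21)); // 42, via the fast path
console.log(callIndirectSpeculative(1, 1));  // 101, via the fallback
```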
Why is WasmGC particularly well-suited for these optimizations?
WasmGC extends WebAssembly with managed types like structs, arrays, and references that support subtyping and dynamic dispatch. This makes it possible to compile languages such as Java, Kotlin, and Dart to WebAssembly. However, these high-level features introduce runtime flexibility similar to JavaScript: a method call on a reference could resolve to many different implementations.
Without speculation, V8 would have to generate code that handles every possible type and method implementation, leading to slower execution. With speculative inlining and deopts, the engine can optimize for the most likely types and call targets based on observed behavior. For example, in a Dart application where a virtual call almost always dispatches to one particular toString() implementation, V8 can inline that specific method. If an object with a different implementation appears, a deopt occurs and the system adapts. This lets WasmGC programs reach execution speeds comparable to those of native virtual machine implementations, making WebAssembly a more viable compilation target for high-level languages.
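The adaptation mechanism behind this resembles a monomorphic inline cache, which can be modeled like so (class and function names are invented; real WasmGC dispatch goes through typed function references rather than JavaScript objects):

```javascript
// Two implementations of the same virtual method, standing in for
// different WasmGC subtypes reaching one call site.
class Circle { area() { return 3; } }
class Square { area() { return 4; } }

let cachedCtor = null; // the receiver type speculated at this call site

function callArea(obj) {
  if (obj.constructor === cachedCtor) {
    // Fast path: the speculation held, so an optimizing compiler could
    // inline the cached type's area() body right here.
    return obj.area();
  }
  // Cache miss: analogous to a deopt. Record the newly observed type and
  // take the generic dispatch path; a later reoptimization would
  // speculate on the new type instead.
  cachedCtor = obj.constructor;
  return obj.area();
}

console.log(callArea(new Circle())); // 3 (miss: cache now holds Circle)
console.log(callArea(new Circle())); // 3 (hit: fast path)
console.log(callArea(new Square())); // 4 (miss again: cache updates)
```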
What performance improvements have been observed with these optimizations?
V8's team measured the combined impact of speculative inlining and deoptimization on a range of benchmarks. On a set of Dart microbenchmarks specifically designed to stress dynamic dispatch, the average speedup exceeded 50%. For larger, realistic applications—like complex GUI apps or data processing tools—the improvement was more modest but still meaningful, ranging from 1% to 8%.
These gains come primarily from reducing the overhead of indirect calls and enabling better register allocation and instruction scheduling after inlining. While 50% might sound enormous, it reflects targeted microbenchmarks; real-world applications often have mixed code where some parts benefit less. Nevertheless, the optimizations are significant because they close the performance gap between WebAssembly and native code for managed languages. Additionally, they serve as a foundation for future optimizations, meaning the benefits will likely grow as V8 continues to refine its WebAssembly compilation pipeline.
How do these optimizations compare to JavaScript’s speculative approach?
JavaScript has relied on speculative optimizations for years, using feedback from previous runs to generate highly optimized machine code. Deoptimization is a key part of this: when assumptions break, execution falls back to a slower interpreter or baseline compiler. V8’s new WebAssembly optimizations mirror this same strategy.
However, there are differences. JavaScript’s dynamic nature means speculations are more aggressive but also riskier. Wasm, being statically typed, provides more reliable feedback—especially for function types and indirect call targets. The overhead of deoptimization is similar, but because Wasm code is often denser and more predictable, the performance wins can be more consistent. In both cases, the engine gathers runtime type profiles, inlines likely targets, and uses deopts as a safety net. This convergence demonstrates how WebAssembly is evolving to support dynamic-language workloads without sacrificing the performance benefits of static compilation.
What does the future hold for WebAssembly optimizations in V8?
Deoptimization support and speculative inlining are just the beginning for WebAssembly in V8. They open the door to many more speculative techniques previously only available for JavaScript. For instance, V8 could use runtime-guided constant folding, branch prediction based on profiles, or even speculative array bounds check elimination.
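As one illustration, speculative bounds-check elimination could replace a per-iteration check with a single up-front guard, roughly as sketched below (a hand-written model of the transformation; nothing here is an existing V8 feature):

```javascript
// What the engine conceptually starts with: a check on every iteration,
// mirroring Wasm's trap on out-of-bounds array access.
function sumChecked(arr, n) {
  let s = 0;
  for (let i = 0; i < n; i++) {
    if (i >= arr.length) throw new RangeError("out of bounds");
    s += arr[i];
  }
  return s;
}

// What it could speculatively emit instead: one guard before the loop.
// On guard failure a real engine would deopt to the checked version
// rather than throw.
function sumSpeculative(arr, n) {
  if (n > arr.length) throw new RangeError("guard failed: would deopt");
  let s = 0;
  for (let i = 0; i < n; i++) {
    s += arr[i]; // bounds check eliminated inside the hot loop
  }
  return s;
}

console.log(sumSpeculative([1, 2, 3, 4], 4)); // 10, with no per-iteration checks
```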
Additionally, as WasmGC matures and more languages target it, feedback-driven optimizations will become even more crucial. The V8 team plans to extend profiling to cover more operations (like type tests and casts) and improve the granularity of deoptimization (e.g., per-instruction deoptimization). This will allow the engine to make bolder speculations with minimal penalty when wrong. Ultimately, these optimizations aim to make WebAssembly execution as fast as possible for all code—whether ported from C++ or compiled from modern managed languages—while maintaining the security and portability guarantees of the Wasm sandbox.