Blog Post

WebAssembly: A New Hope

Illustration: WebAssembly: A New Hope
Information

This article was first published in August 2017 and was updated in August 2024.

In March 2017, the WebAssembly Community Group reached a consensus on the initial (MVP) binary format, JavaScript API, and reference interpreter. This marked the beginning of a new era for high-performance computing in web browsers. Today, WebAssembly has become a critical component of the web ecosystem, enabling developers to run complex applications in the browser with near-native performance. But what exactly is WebAssembly, and how has it evolved over the years?

Information

Don’t miss our follow-up post, Optimizing WebAssembly Startup Time

The Evolution of WebAssembly

Since its initial launch, WebAssembly (Wasm) has seen widespread adoption across all major browsers, including Chrome, Firefox, Safari, and Edge. With the addition of features like threads, SIMD, and reference types, WebAssembly is no longer just a low-level language for performance-critical code — it has become a powerful platform for building sophisticated web applications.

The Dream of High-Performance Computing in a Secure Browser

A slide from Alon Zakai’s [presentation about Emscripten][] in 2013.

In 2013, Alon Zakai from Mozilla developed a compiler called Emscripten, which he used to compile games written in C and C++ to a subset of JavaScript. This subset, called asm.js, was created by Luke Wagner and is aimed at bringing extraordinary optimizations to the language.

By making use of the asm pragma ("use asm";) and nifty typing hints, asm.js allows JavaScript interpreters that support it to use low-level CPU operations instead of the more expensive JavaScript. This makes it possible to bypass a lot of the difficult-to-optimize routines, like coercion and garbage collection, in situations where we know we don’t need it. And if the interpreter doesn’t offer this support, the code will still execute with identical results, albeit more slowly.

A Primer to asm.js

asm.js specifies the use of type hints under the hood. For example, a | 0 is used to hint that a is a 32-bit integer or +a is a double (64-bit floating point). The former works because the spec defines bitwise operators to operate on a sequence of 32 bits. These expressions have no side effects and can thus be inserted wherever it’s necessary to hint the type — either in arguments to a function call, or in the function call’s return value.

In the following example, an asm.js-optimized JavaScript interpreter might only execute a single 32-bit addition when the add export is called, whereas an interpreter that doesn’t support asm.js will have to execute many more instructions to fully follow the ECMAScript specification, because it has no advance knowledge of the types being passed to the function:

function AsmModule() {
	'use asm';
	return {
		add: function (a, b) {
			a = a | 0;
			b = b | 0;
			return (a + b) | 0;
		},
	};
}

To avoid expensive garbage collection routines, memory management is delegated entirely to the application, much like in a typical native application where the code must allocate and deallocate memory directly from the assigned part of RAM. To implement this, a large memory buffer is allocated upfront and then used throughout the asm.js block. The asm.js code creates typed views to access slices of that buffer and use them as memory:

var heap = new ArrayBuffer(0x100000); // 128kb
var pointer = 0x100;
var view = new Int32Array(heap, pointer, 0x100); // 256 bytes at offset 256
view[0] = 327;
view[1] = 1138;

In a block marked as "use asm";, all advanced JavaScript features can be deactivated until a violation occurs (for example, until a reference to an object is cleared).

Typically, asm.js code isn’t written by hand, rather it’s the result of compiling code from another language — usually C or C++. To create asm.js-optimized code, Emscripten — an LLVM-to-JavaScript compiler tool — was created. LLVM is a popular tool in many native development toolchains. It defines an intermediate representation (LLVM IR) that sets out a low-level language similar to assembly. In native development steps, this intermediate code can already be heavily optimized and easily translated to the target architecture (i.e. the CPU architecture/instruction set the code should be run on, like x86 or ARM). Emscripten reads this intermediate representation and translates it to asm.js, with additional optimization steps in the middle.

Before asm.js code is generated, LLVM intermediate representation is generated.

While asm.js makes it possible to improve the execution speed significantly and allows low-level languages with no concept of garbage collection, like C/C++, to compile to the web, it unfortunately comes with a few downsides:

  1. Type hints and the JavaScript syntax can result in very large asm.js files.

  2. It needs to be parsed like JavaScript, which can be expensive on lower-end devices like mobile phones.

  3. Since asm.js needs to be valid JavaScript, adding new features to it is very complex and affects JavaScript as well.

  4. Growing the initial heap at runtime is expensive, since ArrayBuffers, which are used to store the heap, are immutable. To solve this, one must create a new, larger ArrayBuffer and copy the content from the first buffer into the second one. This operation results in an asm.js violation, which is why Emscripten warns about disabled optimizations.

The New Kid on the Block

To solve all the above issues, the development of WebAssembly, or Wasm — “a new, portable, size- and load-time-efficient format suitable for compilation to the web” — started in 2015. The first version currently being deployed to all major browsers (Chrome, Firefox, Safari, and Edge) is a replacement for asm.js and comes with all of the same features. Meanwhile, other more powerful features, such as threads, are planned for future versions. It’s important here to point out that WebAssembly is designed to complement JavaScript, not replace it, and that in a browser context, it can only access the DOM via JavaScript and not directly.

WebAssembly support is rolling out this year in all major browsers. PSPDFKit for Web also supports older browsers by falling back to asm.js.

WebAssembly consists of four key concepts:

  1. Module — A compiled WebAssembly unit. Similar to an ES2015 module, a WebAssembly module declares imports and exports to the JavaScript language.

  2. Memory — A growable ArrayBuffer.

  3. Table — An array to store function references. This offers another way to access JavaScript functions inside WebAssembly, since these functions cannot be stored directly in the memory and called that way. Instead, a function will be stored in a table and can be called with its index. A table can be mutated by the host environment (JavaScript).

  4. Instance — A stateful, initialized module that’s connected to both a memory and a table object.

Additionally, WebAssembly defines a binary representation for the language code that’s similar to LLVM IR and needs to be compiled to the host architecture before it can be used. Some implementations, such as Microsoft’s Chakra, use a just-in-time (JIT) strategy to compile to the native host architecture, and others, such as Google’s Chrome, compile the entire module upfront. To avoid compilation every time a WebAssembly module is requested, the resulting module can also be persisted on a client using IndexedDB:

var importObject = {
	imports: {
		imported_func: function (arg) {
			console.log(arg);
		},
	},
};

fetch('application.wasm')
	.then((response) => response.arrayBuffer())
	.then((bytes) => WebAssembly.instantiate(bytes, importObject))
	.then((result) => result.instance.exports.exported_func());

With this design, WebAssembly solves all the issues with asm.js that were covered above:

  1. The file size is a lot smaller because of the binary representation. For our product, the WebAssembly version is about half the size of the asm.js build (about one-third for a gzipped build).

  2. WebAssembly improves execution time by making multiple steps in the engine’s pipeline faster. For example, parsing is greatly simplified, and the code is already in an intermediate format and just needs to be validated. Lin Clark wrote an excellent article about the reasons WebAssembly is fast.

  3. WebAssembly can be improved with new features independent of JavaScript. A good example of this is the SIMD (single instruction, multiple data) extension to JavaScript. This CPU-powered acceleration is a key optimization in modern assembly, but the impact on the JavaScript API is so big that plans to bring it there were abandoned because of its complexity. Instead, this feature will be added to WebAssembly directly with no JavaScript API.

  4. WebAssembly’s memory concept is based on the growable Memory class. This makes it possible to have more dynamic memory allocation.

WebAssembly is becoming the de facto solution for bringing native code to the web. With support from all major browsers and the development of a new LLVM backend inside the LLVM master branch, we’re looking at a bright future for the web.

Key Improvements Since 2017

  1. Broader browser support — WebAssembly is now fully supported across all major browsers and platforms, including mobile browsers. This universal support ensures that applications built with WebAssembly run consistently and efficiently everywhere.

  2. Advanced features — The introduction of multithreading, SIMD, and other advanced features has significantly improved the performance of WebAssembly applications, making it suitable for complex workloads like gaming, machine learning, and multimedia processing.

  3. Interoperability with JavaScript — Newer features, such as reference types, have enhanced the integration between WebAssembly and JavaScript, allowing for more seamless data exchange between the two environments.

  4. WebAssembly System Interface (WASI) — WASI provides a standardized API for running WebAssembly outside the browser, opening up new possibilities for server-side and edge computing.

WebAssembly at PSPDFKit

We recently released PSPDFKit for Web 2017.5, the first version of our web framework that supports standalone rendering, i.e. on the client without a daemon running on a server. To completely avoid a server component that can read a PDF, we worked hard to compile our 500,000 LOC C++ core to WebAssembly and asm.js. It’s extremely important to us that we can reuse the PDF-rendering code across all our modern platforms, because PDF rendering is hard to get right. Our shared core gives us the same low-level rendering and parsing of PDF documents everywhere and allows us to fully focus on one PDF engine.

The new PSPDFKit for Web now also includes four artifacts that are available next to pspdfkit.js and pspdfkit.css:

Filename Description
pspdfkit.wasm This file contains the WebAssembly binary code.
pspdfkit.wasm.js A small wrapper around the WebAssembly module to create a unified API that’s shared with asm.js.
pspdfkit.asm.js The asm.js build of our PDF backend.
pspdfkit.asm.js.mem A binary file that contains initial memory values for the asm.js build.

When the PDF viewer is initialized, we feature test the presence of WebAssembly, as well as some additional WebAssembly features, to decide how to initialize the native module.

Talking about an exciting new technology is one thing — but we want you to experience what WebAssembly made possible at PSPDFKit. The following demo of our PDF framework will use WebAssembly when it’s available and fall back to asm.js otherwise. While you were reading the article, we prepared everything. This is just an example of how easy it is to integrate PSPDFKit.

PSPDFKit for Web Using asm.js Fallback

To give you a comparison of the rendering performance between native, WebAssembly, and asm.js, we conducted an extensive benchmark across various devices.

A comparison of PDF-rendering performance across different devices when using native, wasm, and asm.js.

While the results are already impressive, we want to point out that WebAssembly is very new and improving at a fast rate. The new LLVM backend is still experimental, and while porting PSPDFKit to WebAssembly, we discovered a lot of edge cases we could only solve with the help of browser vendors. At this point, we want to issue a special thank you to the WebAssembly teams at Mozilla and Google, and especially to Alon Zakai, for being so helpful. With their help, we were able to navigate the edge cases we encountered and build our WebAssembly-based PDF viewer, even improving the Emscripten tool chain a bit along the way.

While we’re very optimistic about the current state of WebAssembly, we know that less capable systems still struggle with expensive rendering operations. For those cases, we recommend you check out our Server-backed product, which already enables super-fast PDF rendering, even on lower-end devices. In the future, we also want Server-backed installations to make use of WebAssembly and will enable progressive streaming of client-side rendered PDF documents on demand. We believe that a combination of server- and client-side technologies will offer the best experience for displaying PDF documents on the web, and we’re working hard to make this as seamless as possible.

Information

Don’t miss our follow-up post, Optimizing WebAssembly Startup Time

Since 2017, PSPDFKit has continuously improved its use of WebAssembly. Our latest release, PSPDFKit for Web 2024.5, leverages WebAssembly for rendering PDF documents with unparalleled speed and accuracy. By compiling our core C++ library to WebAssembly, we ensure that our PDF viewer provides a native-like experience directly in the browser.

WebAssembly has come a long way since its introduction, evolving from a promising technology into a cornerstone of modern web development. At PSPDFKit, we’re committed to leveraging the full potential of WebAssembly to deliver the best possible experience for our users.

Are you interested in WebAssembly and other web technologies and looking for a place to work that exercises your brain while preserving the work/life balance? Check out our current job offers.

FAQ

Here are a few frequently asked questions about WebAssembly.

What is WebAssembly and how does it improve web performance?

WebAssembly (Wasm) is a binary instruction format that enables high-performance applications to run in web browsers. It improves web performance by allowing code to execute at near-native speed, making it ideal for tasks like PDF rendering and complex computations.

How does WebAssembly differ from asm.js?

WebAssembly is a binary format that is more efficient and faster to parse than asm.js. While asm.js is a subset of JavaScript optimized for performance, WebAssembly is designed as a portable, low-level code format that provides better performance and smaller file sizes.

What are the key components of WebAssembly?

WebAssembly consists of four key components: Module (a compiled WebAssembly unit), Memory (a growable ArrayBuffer), Table (an array storing function references), and Instance (a stateful, initialized module connected to memory and table objects). These components work together to execute WebAssembly code efficiently.

How does PSPDFKit leverage WebAssembly for PDF rendering?

PSPDFKit uses WebAssembly to compile its C++ PDF rendering engine into a format that can run directly in the browser. This allows for standalone PDF rendering without the need for a server, resulting in faster and more efficient document processing.

What future features are planned for WebAssembly?

Future features for WebAssembly include support for threads, SIMD (single instruction, multiple data) extensions, and other performance enhancements. These features will further improve the capabilities of WebAssembly, making it even more powerful for web applications.

Explore related topics

Related products
Share post
Free trial Ready to get started?
Free trial