PDF rendering — Definition, how to render, and more

Philipp Spiess

Updated: November 5, 2024

TL;DR

This tutorial explores Mozilla’s PDF.js library, demonstrating two implementation approaches: direct canvas rendering for precise control, and iframe embedding for a ready-to-use viewer. PDF.js uses JavaScript to render PDFs into HTML5 canvas elements without plugins, working across three layers (core parsing, display API, and viewer UI). While free and flexible, it lacks enterprise features like annotation editing, reliable form handling, advanced view options, and digital signatures, making it suitable for basic viewing needs, but limited for complex document workflows.

PDF files are commonly used in many businesses today — whether you want to generate sales reports, deliver contracts, or send invoices, PDF is the file type of choice. In an earlier post, we looked at a native solution that works on many browsers without the use of JavaScript or any third-party browser plugins.

What is PDF rendering?

To visually present a PDF file, its content must first be converted into an image format suitable for display. PDF rendering refers to the process of transforming a PDF into an image that can be viewed on a screen.

To achieve this, a PDF library must first decompress the binary PDF file and interpret its structure. Then, the rendering engine translates the parsed data into graphical drawing operations.

Typically, the visual elements within a PDF document are represented using one of two formats: raster or vector graphics.

This tutorial will examine one of the most popular open source libraries for rendering PDF files in the browser: PDF.js. This library is versatile and can render PDF files both in the browser and on the server. We’ll walk you through how to render a PDF and how to embed a PDF viewer in the browser, and at the end, we’ll discuss cases in which you should opt for a commercial JavaScript PDF viewer.

Introduction to PDF.js

PDF.js is a robust JavaScript library developed by Mozilla, designed to render PDF files directly within web applications. It allows developers to view and interact with PDF documents in a web browser, eliminating the need for external plugins or software. It can display both simple text-based PDFs and complex layouts with intricate graphics.

One of the standout benefits of PDF.js is its ability to integrate PDF viewing capabilities directly into a web application. This means you don’t need a separate PDF viewer, making it ideal for businesses and organizations that need to present PDF files to their users or customers. Moreover, PDF.js is highly customizable, allowing developers to tailor the appearance and behavior of the PDF viewer to meet specific requirements.

In summary, PDF.js is a versatile and powerful library for working with PDF files in web applications. Its efficient rendering, combined with a range of interactive features, makes it an ideal solution for a wide array of use cases.

Displaying a PDF in the browser with PDF.js

PDF.js is a JavaScript library written by Mozilla. Since it implements PDF rendering in vanilla JavaScript, it has cross-browser compatibility and doesn’t require additional plugins to be installed. We suggest carefully testing correctness, as there are many known problems with the render fidelity of PDF.js.

How does PDF.js handle PDFs?

With PDF.js, PDFs are downloaded via AJAX and rendered in a <canvas> element using native drawing commands. To improve performance, a lot of the processing work happens in a web worker, where the work of the core layer usually takes place.

PDF.js consists of three different layers:

Core — The binary format of a PDF is interpreted in this layer. Using the layer directly is considered advanced usage.
Display — This layer builds upon the core layer and exposes an easy-to-use interface for most day-to-day work.
Viewer — In addition to providing a programmatic API, PDF.js also comes with a ready-to-use user interface that includes support for search, rotation, a thumbnail sidebar, and many other things.

To get started, all you need to do is download a recent copy of PDF.js and you’re good to go!

Rendering a PDF

To render a specific page of a PDF into a <canvas> element, use the display layer of PDF.js. After downloading PDF.js, extract the necessary files — pdf.mjs and pdf.worker.mjs — from the build/ folder. These files will enable you to build a simple PDF renderer.

Create an empty directory, move the two files into it, and add your simple.js and simple.html files.

HTML setup

The HTML file only needs to load your custom application code (simple.js) as a module; simple.js in turn imports pdf.mjs. Additionally, you need to create a <canvas> element where the first page of the PDF will be rendered:

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="UTF-8" />
    <title>PDF.js Example</title>
    <script type="module" src="/simple.js"></script>
  </head>
  <body>
    <canvas id="pdf"></canvas>
  </body>
</html>

JavaScript setup

Now, use the PDF.js API in simple.js to render the first page. The getDocument(url) method starts loading the PDF and returns a loading task whose promise resolves to the document. From there, you can access pages using the getPage(pageNumber) method; page numbers start at 1. The getViewport({ scale }) method provides the dimensions of the page, which you’ll use to size the <canvas> element accordingly. You also need to specify the worker file explicitly:

import { getDocument, GlobalWorkerOptions } from '/pdf.mjs';

// Specify the path to the PDF worker script
GlobalWorkerOptions.workerSrc = '/pdf.worker.mjs';

(async () => {
  const loadingTask = getDocument('/test.pdf');
  const pdf = await loadingTask.promise;

  // Load the first page.
  const page = await pdf.getPage(1);

  const scale = 1;
  const viewport = page.getViewport({ scale });

  // Get the canvas element and set its size based on the PDF page.
  const canvas = document.getElementById('pdf');
  const context = canvas.getContext('2d');
  canvas.height = viewport.height;
  canvas.width = viewport.width;

  // Render the page into the canvas.
  const renderContext = {
    canvasContext: context,
    viewport: viewport,
  };
  await page.render(renderContext).promise;
  console.log('Page rendered!');
})();
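
On high-density displays, a canvas sized only from the viewport can look blurry. One common remedy, sketched here, is to enlarge the canvas backing store by the device pixel ratio while keeping its on-screen size unchanged, and to pass a matching transform to page.render(). The helper name computeCanvasLayout and the shape of its return value are illustrative assumptions and not part of PDF.js; only the viewport dimensions and the transform render parameter come from the library.

```javascript
// Compute canvas dimensions for a viewport and an output scale.
// `viewport` only needs numeric `width` and `height`, as returned by
// page.getViewport({ scale }). `outputScale` is typically
// window.devicePixelRatio (assumed to default to 1 here).
function computeCanvasLayout(viewport, outputScale = 1) {
  return {
    // Backing-store size in device pixels.
    width: Math.floor(viewport.width * outputScale),
    height: Math.floor(viewport.height * outputScale),
    // On-screen size in CSS pixels.
    cssWidth: `${Math.floor(viewport.width)}px`,
    cssHeight: `${Math.floor(viewport.height)}px`,
    // Extra transform for page.render(); null means no extra scaling.
    transform:
      outputScale !== 1 ? [outputScale, 0, 0, outputScale, 0, 0] : null,
  };
}

// Hypothetical wiring inside the async function of simple.js:
//   const layout = computeCanvasLayout(viewport, window.devicePixelRatio || 1);
//   canvas.width = layout.width;
//   canvas.height = layout.height;
//   canvas.style.width = layout.cssWidth;
//   canvas.style.height = layout.cssHeight;
//   await page.render({
//     canvasContext: context,
//     viewport,
//     transform: layout.transform,
//   }).promise;
```

Keeping the calculation in a small pure function makes it easy to check the numbers without a browser; the commented wiring shows where the values would go in the example above.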

Running the example

Make sure to place a PDF file named test.pdf in the same directory as simple.html. To run the code, you need a local web server, because module scripts and the PDF itself are fetched over HTTP. If you’re using Python, you can start a server in that directory with this command:

python3 -m http.server 8000

Next, open the example at localhost:8000/simple.html.

PDF.js rendering the first page of a document in the browser
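
The example above renders only the first page. A minimal sketch for rendering every page follows, assuming the loaded document exposes numPages and getPage() as in the code above. The function name renderAllPages and the createCanvas callback are hypothetical helpers introduced for illustration.

```javascript
// Render every page of a loaded document, one after another.
// `pdf` is expected to behave like the object resolved by
// getDocument(...).promise (it needs `numPages` and `getPage`).
// `createCanvas` is a hypothetical callback returning a fresh canvas per page.
async function renderAllPages(pdf, createCanvas, scale = 1) {
  const canvases = [];
  // PDF.js page numbers start at 1, not 0.
  for (let pageNumber = 1; pageNumber <= pdf.numPages; pageNumber++) {
    const page = await pdf.getPage(pageNumber);
    const viewport = page.getViewport({ scale });

    const canvas = createCanvas(pageNumber);
    canvas.width = viewport.width;
    canvas.height = viewport.height;

    // Wait for each page to finish before starting the next one.
    await page.render({
      canvasContext: canvas.getContext('2d'),
      viewport,
    }).promise;
    canvases.push(canvas);
  }
  return canvases;
}

// Hypothetical usage in the browser:
//   const pdf = await getDocument('/test.pdf').promise;
//   await renderAllPages(pdf, () =>
//     document.body.appendChild(document.createElement('canvas'))
//   );
```

Rendering pages sequentially keeps the sketch simple. For long documents, rendering only the pages that are currently visible is usually a better fit, since every canvas holds a full bitmap in memory.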

Embedding the PDF viewer in an HTML window

While the display layer provides fine-grained control over which parts of a PDF document are rendered, there are times when you might prefer a ready-to-use viewer. Luckily, PDF.js has you covered. In this part, you’ll integrate the PDF.js default viewer into your website.

Looking at the downloaded files, you’ll see another directory, web/. In this directory, you can find all necessary files for the viewer. Copy the entire folder into a new directory called viewer/, which results in files like viewer/web/viewer.mjs.

Just like in the previous example, you need the JavaScript files of PDF.js. To set this up correctly, create the build/ folder inside viewer/ and copy the files there, which results in viewer/build/pdf.mjs and viewer/build/pdf.worker.mjs.

You can now work on the integration. To do this, create a simple HTML file that will include the viewer via an <iframe>. This allows you to embed the viewer into an existing webpage very easily. The viewer is configured via URL parameters, a list of which can be found here. For this example, you’ll only configure the source PDF file. For more advanced features (like saving the PDF document to your web server again), you can start modifying the viewer.html file provided by PDF.js:

<!DOCTYPE html>
<html>
  <head>
    <meta charset="UTF-8" />
    <title>PDF.js Example</title>
  </head>
  <body>
    <iframe
      src="/web/viewer.html?file=/test.pdf"
      width="800px"
      height="600px"
      style="border: none;"
    ></iframe>
  </body>
</html>

This tiny block of HTML is indeed all that’s needed to start working, and all of the PDF.js code is handled conveniently via the viewer.html file delivered by PDF.js. Save the HTML file inside the viewer/ directory so that the /web/viewer.html path resolves, copy a PDF file to viewer/test.pdf, and start a simple HTTP server inside the viewer/ directory again.
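
When the PDF location is not a fixed string, the iframe URL has to be assembled with care, because the file path itself may contain characters such as spaces, ? or &. The sketch below builds the src value by percent-encoding the file parameter and appending optional view settings in the URL fragment. The function name buildViewerSrc and its options object are hypothetical; the page and zoom fragment parameters are assumed to match the viewer options list linked above, so verify them against the version of PDF.js you downloaded.

```javascript
// Build the iframe src for the default viewer.
// `viewerPath` and `pdfPath` are site-specific; the values in the usage
// comment mirror this tutorial's directory layout.
function buildViewerSrc(viewerPath, pdfPath, options = {}) {
  // The file location goes into the query string and is percent-encoded
  // so that characters like "?" or "&" in the path don't break the URL.
  let src = `${viewerPath}?file=${encodeURIComponent(pdfPath)}`;

  // Optional view settings go into the URL fragment (assumed parameters).
  const fragment = [];
  if (options.page) fragment.push(`page=${options.page}`);
  if (options.zoom) fragment.push(`zoom=${options.zoom}`);
  if (fragment.length > 0) src += `#${fragment.join('&')}`;

  return src;
}

// Hypothetical usage in the browser:
//   document.querySelector('iframe').src =
//     buildViewerSrc('/web/viewer.html', '/test.pdf', { page: 2 });
```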

PDF.js viewer in the browser

Conclusion

The three layers of PDF.js are nicely separated, allowing you to focus on the parts you really need. The core layer handles the heavy PDF parsing, an operation that usually takes place in a web worker. This helps keep the main thread responsive. The display layer exposes an interface to easily render a PDF, and you can use this API to render a page into a <canvas> element with only a couple of lines of JavaScript. The third layer, viewer, builds upon the other layers and provides a simple but effective user interface for showing PDF documents in a web browser.

All in all, PDF.js is a great solution for many use cases. However, sometimes your business requires more complex features, such as the following, for handling PDFs in the browser:

PDF annotation support — PDF.js will only render annotations that were already in the source file, and you can use the core API to access raw annotation data. It doesn’t have annotation editing support, so your users won’t be able to create, update, or delete annotations to review a PDF document.
PDF form filling — While PDF.js has started working with interactive forms, our testing found that there are still a lot of issues left open. For example, form buttons and actions aren’t supported, making it impossible to submit forms to your web service.
Mobile support — PDF.js comes with a clean mobile interface, but it misses features that provide a great user experience and are expected nowadays, like pinch-to-zoom. Additionally, downloading an entire PDF document for mobile devices might result in a big performance penalty.
Persistent management — With PDF.js, there’s no option to easily share, edit, and annotate PDF documents across a broad variety of devices (whether it be other web browsers, native apps, or more). If your business relies on this service, consider looking into a dedicated annotation syncing framework like Nutrient Instant.
Digital signatures — PDF.js currently has no support for digital signatures, which are used to verify the authenticity of a filled-out PDF.
Advanced view options — The PDF.js viewer only supports a continuous page view mode, wherein all pages are laid out in a list and the user can scroll through them vertically. Single- or double-page modes — where only one (or two) pages are shown at once (a common option to make it easier to read books or magazines) — aren’t possible.
Render fidelity - There are many known problems with the render fidelity of PDF.js, where PDFs look different, have different colors or shapes, or even miss parts of the document altogether.

If your business relies on any of the above features, consider looking into alternatives. We at Nutrient work on the next generation of PDF viewers for the web. Together with Nutrient Instant, we offer an enterprise-ready PDF solution for web browsers and other platforms, along with industry-leading first-class support included with every plan. Launch our Web demo to see Nutrient Web SDK in action.

FAQ

What is PDF.js and how does it work?

PDF.js is an open source JavaScript library that allows you to render PDF files directly in the browser. It uses the HTML5 <canvas> element and other web technologies to display PDF documents without needing plugins.

How do I integrate PDF.js into my web application?

To integrate PDF.js, download the library from the official repository, include the necessary scripts in your HTML file, and use JavaScript to load and render PDF files onto a canvas element.

Can I customize the appearance and functionality of the PDF viewer with PDF.js?

Yes, PDF.js provides a flexible API that allows you to customize the viewer’s appearance and add additional features like custom toolbars, navigation buttons, and annotation tools.

What are the performance considerations when using PDF.js?

Performance considerations include optimizing PDF rendering for large or complex documents, ensuring smooth navigation and zooming, and managing memory usage to prevent slowdowns or crashes.
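
One way to keep memory usage in check is to release a document as soon as you are done with it. The sketch below wraps that pattern in a helper, assuming the loading task returned by getDocument() exposes a promise and a destroy() method. The name withDocument is hypothetical and not part of PDF.js.

```javascript
// Run `work` against a loaded document and always release it afterwards.
// `loadingTask` is expected to behave like the object returned by
// getDocument(): it exposes `promise` and `destroy()`.
async function withDocument(loadingTask, work) {
  try {
    const pdf = await loadingTask.promise;
    return await work(pdf);
  } finally {
    // Runs on success and on failure, so resources are freed either way.
    await loadingTask.destroy();
  }
}

// Hypothetical usage in the browser:
//   const pageCount = await withDocument(
//     getDocument('/test.pdf'),
//     (pdf) => pdf.numPages
//   );
```

Because the cleanup sits in a finally block, the document is also released when loading or rendering throws, which is easy to forget in hand-written code.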

Can PDF.js be used for creating PDF files?

No, PDF.js is primarily designed for rendering PDF files in the browser. For creating PDF files, you might want to explore other libraries that are specifically designed for generating PDF documents in the browser.