
Rendering PDF files in the browser with PDF.js


PDF files are commonly used in many businesses today — whether you want to generate sales reports, deliver contracts, or send invoices, PDF is the file type of choice. In an earlier post, we looked at a native solution that works on many browsers without the use of JavaScript or any third-party browser plugins.

In this tutorial, we’ll go a bit deeper and examine one of the most popular open source libraries for rendering PDF files in the browser: PDF.js. We’ll walk you through how to render a PDF and how to embed a PDF viewer in the browser, and at the end, we’ll discuss cases in which you should opt for a commercial JavaScript PDF viewer.

Displaying a PDF in the browser with PDF.js

PDF.js is a JavaScript library written by Mozilla. Since it implements PDF rendering in vanilla JavaScript, it has cross-browser compatibility and doesn’t require additional plugins to be installed. We suggest carefully testing correctness, as there are many known problems with the render fidelity of PDF.js.

How does PDF.js handle PDFs?

With PDF.js, PDFs are downloaded via AJAX and rendered in a <canvas> element using native drawing commands. To improve performance, a lot of the processing work happens in a web worker, where the work of the core layer usually takes place.
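Because the document is fetched over the network, the loading task returned by getDocument (introduced in the rendering section below) can report download progress through its onProgress callback. The following is a minimal sketch, not part of the original example: the helper names progressPercent and reportProgress are hypothetical, and the code assumes the server sends a content length so that total is known.

```javascript
// Hypothetical helper: convert byte counts into a whole-number percentage.
// Returns null when the total size is unknown (e.g. no Content-Length header).
function progressPercent(loaded, total) {
	if (!Number.isFinite(total) || total <= 0) return null;
	return Math.min(100, Math.round((loaded / total) * 100));
}

// Hypothetical helper: `loadingTask` is assumed to be the object returned by
// getDocument(url). Its onProgress callback receives { loaded, total }.
function reportProgress(loadingTask, onPercent) {
	loadingTask.onProgress = ({ loaded, total }) => {
		onPercent(progressPercent(loaded, total));
	};
	return loadingTask;
}

// Usage inside a module that imports getDocument:
// const loadingTask = reportProgress(getDocument('/test.pdf'), (p) => {
// 	console.log(p === null ? 'Loading...' : `Loaded ${p}%`);
// });
// const pdf = await loadingTask.promise;
```

Keeping the percentage calculation in its own function makes it easy to test without a browser or a real PDF.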

PDF.js consists of three different layers:

  • Core — The binary format of a PDF is interpreted in this layer. Using the layer directly is considered advanced usage.

  • Display — This layer builds upon the core layer and exposes an easy-to-use interface for most day-to-day work.

  • Viewer — In addition to providing a programmatic API, PDF.js also comes with a ready-to-use user interface that includes support for search, rotation, a thumbnail sidebar, and many other things.

To get started, all you need to do is to download a recent copy of PDF.js and you’re good to go!


Rendering a PDF

To render a specific page of a PDF into a <canvas> element, use the display layer of PDF.js. After downloading PDF.js, extract the necessary files — pdf.mjs and pdf.worker.mjs — from the build/ folder. These files will enable you to build a simple PDF renderer.

Create an empty directory, move the two files into it, and add your simple.js and simple.html files.

HTML setup

The HTML file only needs to load your custom application code (simple.js) as a module script; simple.js in turn imports pdf.mjs. Additionally, you need to create a <canvas> element where the first page of the PDF will be rendered:

<!-- simple.html -->
<!DOCTYPE html>
<html lang="en">
	<head>
		<meta charset="UTF-8" />
		<title>PDF.js Example</title>
		<script type="module" src="/simple.js"></script>
	</head>
	<body>
		<canvas id="pdf"></canvas>
	</body>
</html>

JavaScript setup

Now, use the PDF.js API in simple.js to render the first page. The getDocument(url) method returns a loading task whose promise resolves to the PDF document, and from there, you can access pages using the getPage(pageNumber) method. The getViewport({ scale }) method provides the dimensions of the page at the given scale, which you’ll use to size the <canvas> element accordingly. You also need to specify the worker file explicitly:

// simple.js
import { getDocument, GlobalWorkerOptions } from '/pdf.mjs';

// Specify the path to the PDF worker script
GlobalWorkerOptions.workerSrc = '/pdf.worker.mjs';

(async () => {
	const loadingTask = getDocument('/test.pdf');
	const pdf = await loadingTask.promise;

	// Load the first page.
	const page = await pdf.getPage(1);

	const scale = 1;
	const viewport = page.getViewport({ scale });

	// Get the canvas element and set its size based on the PDF page.
	const canvas = document.getElementById('pdf');
	const context = canvas.getContext('2d');
	canvas.height = viewport.height;
	canvas.width = viewport.width;

	// Render the page into the canvas.
	const renderContext = {
		canvasContext: context,
		viewport: viewport,
	};
	await page.render(renderContext).promise;
	console.log('Page rendered!');
})();
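A fixed scale of 1 rarely matches the space you actually have on the page. One possible variation, sketched below, derives the scale from a target width and multiplies it by the device pixel ratio so the output stays sharp on high-density screens. The helper names fitWidthScale and renderPageToWidth are hypothetical and not part of PDF.js; only getViewport and render come from the library.

```javascript
// Hypothetical helper: the scale at which the page is exactly `targetWidth`
// CSS pixels wide. `page` is assumed to come from pdf.getPage().
function fitWidthScale(page, targetWidth) {
	const unscaled = page.getViewport({ scale: 1 });
	return targetWidth / unscaled.width;
}

// Hypothetical helper: render `page` into `canvas` at `targetWidth` CSS pixels.
// `pixelRatio` would typically be window.devicePixelRatio in a browser.
async function renderPageToWidth(page, canvas, targetWidth, pixelRatio = 1) {
	const cssScale = fitWidthScale(page, targetWidth);
	const viewport = page.getViewport({ scale: cssScale * pixelRatio });

	// The backing store uses device pixels...
	canvas.width = Math.floor(viewport.width);
	canvas.height = Math.floor(viewport.height);
	// ...while the element keeps its layout size in CSS pixels.
	canvas.style.width = `${Math.floor(viewport.width / pixelRatio)}px`;
	canvas.style.height = `${Math.floor(viewport.height / pixelRatio)}px`;

	await page.render({
		canvasContext: canvas.getContext('2d'),
		viewport,
	}).promise;
	return viewport;
}

// Usage inside the async function from simple.js:
// await renderPageToWidth(page, canvas, 800, window.devicePixelRatio || 1);
```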

Running the example

Make sure to place a PDF file named test.pdf in the same directory. To run the code, you need a local web server. If you’re using Python, you can start a server in the test directory with this command:

python3 -m http.server 8000
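If you’d rather stay in JavaScript, a small Node.js server can stand in for the Python command. The sketch below is one way to do it and makes a few assumptions: the file name server.js, the function names, and the short MIME table are all hypothetical choices for this example. The important detail is that .mjs files are served with a JavaScript MIME type, because browsers refuse to run module scripts served with any other content type.

```javascript
// server.js (hypothetical file name)
const http = require('node:http');
const fs = require('node:fs');
const path = require('node:path');

// Only the types this tutorial needs; extend as required.
const MIME_TYPES = {
	'.html': 'text/html; charset=utf-8',
	'.js': 'text/javascript',
	'.mjs': 'text/javascript',
	'.css': 'text/css',
	'.pdf': 'application/pdf',
};

function contentTypeFor(filePath) {
	const ext = path.extname(filePath).toLowerCase();
	return MIME_TYPES[ext] ?? 'application/octet-stream';
}

function createStaticServer(rootDir) {
	const root = path.resolve(rootDir);
	return http.createServer((req, res) => {
		let filePath;
		try {
			const { pathname } = new URL(req.url, 'http://localhost');
			filePath = path.join(root, decodeURIComponent(pathname));
		} catch {
			res.writeHead(400).end('Bad request');
			return;
		}
		// Refuse anything that resolves outside the served directory.
		if (filePath !== root && !filePath.startsWith(root + path.sep)) {
			res.writeHead(403).end('Forbidden');
			return;
		}
		fs.readFile(filePath, (err, data) => {
			if (err) {
				res.writeHead(404).end('Not found');
				return;
			}
			res.writeHead(200, { 'Content-Type': contentTypeFor(filePath) });
			res.end(data);
		});
	});
}

// To serve the current directory on port 8000, add this line and run
// `node server.js`:
// createStaticServer('.').listen(8000);
```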

Next, open the example at localhost:8000/simple.html.

PDF.js rendering the first page of a document in the browser

Embedding the PDF viewer in an HTML window

While the display layer provides fine-grained control over which parts of a PDF document are rendered, there are times when you might prefer a ready-to-use viewer. Luckily, PDF.js has you covered. In this part, you’ll integrate the PDF.js default viewer into your website.

Looking at the downloaded files, you’ll see another directory, web/. In this directory, you can find all necessary files for the viewer. Copy the entire folder into a new directory called viewer/, which results in files like viewer/web/viewer.mjs.

Just like in the previous example, you need the JavaScript files of PDF.js. To set this up correctly, create the build/ folder inside viewer/ and copy the files there, which results in viewer/build/pdf.mjs and viewer/build/pdf.worker.mjs.

You can now work on the integration. To do this, create a simple HTML file that will include the viewer via an <iframe>. This allows you to embed the viewer into an existing webpage very easily. The viewer is configured via URL parameters, which are listed in the PDF.js viewer documentation. For this example, you’ll only configure the source PDF file. For more advanced features (like saving the PDF document to your web server again), you can start modifying the viewer.html file provided by PDF.js:

<!DOCTYPE html>
<html>
	<head>
		<meta charset="UTF-8" />
		<title>PDF.js Example</title>
	</head>
	<body>
		<iframe
			src="/web/viewer.html?file=/test.pdf"
			width="800"
			height="600"
			style="border: none;"
		></iframe>
	</body>
</html>

This tiny block of HTML is indeed all that’s needed to start working, and all of the PDF.js code is handled conveniently via the viewer.html file delivered by PDF.js. Now copy a PDF file to viewer/test.pdf and start a simple HTTP server inside the viewer/ directory again.
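If the PDF path comes from user input or contains spaces, query strings, or other special characters, it’s safer to build the viewer URL in code so the file parameter is properly encoded. The sketch below shows one way to do that. The function name buildViewerUrl is hypothetical, and the #page fragment is assumed to be among the options the default viewer understands; check the options list for your PDF.js version before relying on it.

```javascript
// Hypothetical helper: build the iframe URL for the default viewer.
// `fileUrl` is percent-encoded so characters such as "?" and "&" in the PDF
// URL are not mistaken for viewer parameters.
function buildViewerUrl(viewerPath, fileUrl, { page } = {}) {
	const url = `${viewerPath}?file=${encodeURIComponent(fileUrl)}`;
	// Assumption: the viewer opens at the page given in the "#page=" fragment.
	return Number.isInteger(page) && page > 0 ? `${url}#page=${page}` : url;
}

// Usage in a browser:
// document.querySelector('iframe').src =
// 	buildViewerUrl('/web/viewer.html', '/test.pdf', { page: 2 });
```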

PDF.js viewer in the browser

Conclusion

The three layers of PDF.js are nicely separated, allowing you to focus on the parts you really need. The core layer handles the heavy PDF parsing, an operation that usually takes place in a web worker. This helps keep the main thread responsive. The display layer exposes an interface to easily render a PDF, and you can use this API to render a page into a <canvas> element with only a couple of lines of JavaScript. The third layer, viewer, builds upon the other layers and provides a simple but effective user interface for showing PDF documents in a web browser.

All in all, PDF.js is a great solution for many use cases. However, sometimes your business requires more complex features, such as the following, for handling PDFs in the browser:

  • PDF annotation support — PDF.js will only render annotations that were already in the source file, and you can use the core API to access raw annotation data. It doesn’t have annotation editing support, so your users won’t be able to create, update, or delete annotations to review a PDF document.

  • PDF form filling — While PDF.js has started working with interactive forms, our testing found that there are still a lot of issues left open. For example, form buttons and actions aren’t supported, making it impossible to submit forms to your web service.

  • Mobile support — PDF.js comes with a clean mobile interface, but it misses features that provide a great user experience and are expected nowadays, like pinch-to-zoom. Additionally, downloading an entire PDF document for mobile devices might result in a big performance penalty.

  • Persistent management — With PDF.js, there’s no option to easily share, edit, and annotate PDF documents across a broad variety of devices (whether it be other web browsers, native apps, or more). If your business relies on this service, consider looking into a dedicated annotation syncing framework like Nutrient Instant.

  • Digital signatures — PDF.js currently has no support for digital signatures, which are used to verify the authenticity of a filled-out PDF.

  • Advanced view options — The PDF.js viewer only supports a continuous page view mode, wherein all pages are laid out in a list and the user can scroll through them vertically. Single- or double-page modes — where only one (or two) pages are shown at once (a common option to make it easier to read books or magazines) — aren’t possible.

  • Render fidelity - There are many known problems with the render fidelity of PDF.js, where PDFs look different, have different colors or shapes, or even miss parts of the document altogether.
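To illustrate the read-only annotation access mentioned in the first item of the list above, here is a minimal sketch that counts the annotations already present in a document by type. It assumes each page object offers a getAnnotations() method that resolves to an array of objects with a subtype field (for example, Link or Text). The helper names are hypothetical.

```javascript
// Hypothetical helper: tally annotation objects by their `subtype` field.
function countBySubtype(annotations) {
	const counts = {};
	for (const { subtype } of annotations) {
		const key = subtype ?? 'Unknown';
		counts[key] = (counts[key] ?? 0) + 1;
	}
	return counts;
}

// Hypothetical helper: `pdf` is assumed to be the document returned by
// `await getDocument(url).promise`.
async function summarizeAnnotations(pdf) {
	const totals = {};
	for (let n = 1; n <= pdf.numPages; n++) {
		const page = await pdf.getPage(n);
		const perPage = countBySubtype(await page.getAnnotations());
		for (const [subtype, count] of Object.entries(perPage)) {
			totals[subtype] = (totals[subtype] ?? 0) + count;
		}
	}
	return totals;
}

// Usage: console.log(await summarizeAnnotations(pdf));
```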

If your business relies on any of the above features, consider looking into alternatives. We at Nutrient work on the next generation of PDF viewers for the web. Together with Nutrient Instant, we offer an enterprise-ready PDF solution for web browsers and other platforms, along with industry-leading first-class support included with every plan. Launch our Web demo to see Nutrient Web SDK in action.

FAQ

Here are a few frequently asked questions about rendering PDF files in the browser with PDF.js.

What is PDF.js and how does it work?

PDF.js is an open source JavaScript library that allows you to render PDF files directly in the browser. It uses the HTML5 <canvas> element and other standard web technologies to display PDF documents without needing plugins.

How do I integrate PDF.js into my web application?

To integrate PDF.js, download the library from the official repository, include the necessary scripts in your HTML file, and use JavaScript to load and render PDF files onto a canvas element.

Can I customize the appearance and functionality of the PDF viewer with PDF.js?

Yes, PDF.js provides a flexible API that allows you to customize the viewer’s appearance and add additional features like custom toolbars, navigation buttons, and annotation tools.

What are the performance considerations when using PDF.js?

Performance considerations include optimizing PDF rendering for large or complex documents, ensuring smooth navigation and zooming, and managing memory usage to prevent slowdowns or crashes.
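One common way to keep memory in check with large documents is to render only the pages near the one the user is looking at, and to release page resources once a page has been drawn. The sketch below illustrates that idea under a few assumptions: pageWindow and renderWindow are hypothetical helpers, renderPage is a callback you supply (for example, one that draws into a <canvas>), and page.cleanup() is used to let PDF.js free the resources it allocated for the page.

```javascript
// Hypothetical helper: 1-based page numbers within `radius` of `current`,
// clamped to the document's bounds.
function pageWindow(current, numPages, radius = 1) {
	const first = Math.max(1, current - radius);
	const last = Math.min(numPages, current + radius);
	const pages = [];
	for (let n = first; n <= last; n++) pages.push(n);
	return pages;
}

// Hypothetical helper: render only the pages around `current`.
// `renderPage(page, pageNumber)` is supplied by the caller and must resolve
// once rendering has finished.
async function renderWindow(pdf, current, renderPage, radius = 1) {
	for (const n of pageWindow(current, pdf.numPages, radius)) {
		const page = await pdf.getPage(n);
		await renderPage(page, n);
		// Release resources held for this page now that it has been drawn.
		page.cleanup();
	}
}

// Usage: await renderWindow(pdf, 5, (page, n) => drawIntoCanvas(page, n));
// (drawIntoCanvas is a placeholder for your own rendering function.)
```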
