Generating PDF documents is a common requirement in many web applications. Whether it's for creating invoices, reports, or downloadable content, having an efficient solution in place can greatly streamline business processes. Node.js offers powerful libraries to generate PDFs, making it an excellent choice for developers looking to automate document creation in their applications.
This post explores how to generate PDFs from HTML using Node.js with Puppeteer and Nutrient Document Engine. It’ll cover setting up Document Engine, preparing HTML content, and generating professional-quality PDFs.
Why use Node.js for PDF generation?
Node.js provides several advantages when it comes to generating PDFs:
- Lightweight and fast: Node.js libraries are designed for efficient processing and integration.
- Automation capabilities: PDF generation can be automated based on user input, database records, or scheduled tasks.
- Customization: Developers can dynamically generate documents tailored to user needs.
Puppeteer and Nutrient Document Engine are two popular options for PDF generation. Puppeteer is ideal for automating the process of rendering HTML into PDF, while Nutrient Document Engine offers advanced features for enterprise-level PDF production.
Setting up PDF generation with Puppeteer
Step 1 — Installing dependencies
Start by initializing your project and installing Puppeteer for PDF generation:
mkdir html-to-pdf && cd html-to-pdf
npm init -y
npm install puppeteer
Step 2 — Creating the HTML template
Write an HTML template, template.html, which will be rendered as a PDF.
Styling and layout customization are essential when converting HTML to PDF, especially if a document needs to adhere to a specific design. Puppeteer, being a browser automation tool, is excellent at preserving complex layouts and CSS. You can apply intricate CSS styles to your HTML before conversion, ensuring a PDF reflects the exact design intended. For example:
<!DOCTYPE html>
<html>
  <head>
    <title>Sample PDF</title>
    <style>
      body {
        font-family: Arial, sans-serif;
        background-color: #f4f4f9;
        margin: 0;
        padding: 0;
      }
      .container {
        width: 80%;
        margin: 0 auto;
        padding: 20px;
        border: 1px solid #ddd;
        background-color: #fff;
      }
      h1 {
        color: #4caf50;
        font-size: 2em;
      }
      footer {
        text-align: center;
        font-size: 0.8em;
        color: #777;
        position: absolute;
        bottom: 20px;
        width: 100%;
      }
    </style>
  </head>
  <body>
    <div class="container">
      <h1>Hello, World!</h1>
      <p>This PDF was generated from HTML.</p>
    </div>
    <footer>
      <p>© 2025 My Company. All rights reserved.</p>
    </footer>
  </body>
</html>
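Because Puppeteer renders the page in Chromium, the generated PDF keeps the layout, typography, and other visual elements of the styled HTML page. Note that page.pdf() applies print CSS media by default, and background colors appear only when printBackground is set to true, as in the script in the next step.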
Step 3 — Writing the Puppeteer script
Create a new file, generatePdf.js, to render the HTML to PDF:
const fs = require('fs');
const puppeteer = require('puppeteer');

async function generatePdf() {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  // Load the template and wait until network activity has settled.
  const html = fs.readFileSync('template.html', 'utf8');
  await page.setContent(html, { waitUntil: 'networkidle0' });

  await page.pdf({
    path: 'output.pdf',
    format: 'A4',
    printBackground: true,
  });

  await browser.close();
  console.log('PDF generated successfully');
}

generatePdf();
Step 4 — Running the script
Generate the PDF by running the script:
node generatePdf.js
After the script finishes, a file named output.pdf will appear in the project folder.
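To confirm that the script produced a real PDF rather than an empty or truncated file, one option is to check the file's leading bytes. The sketch below assumes only that a generated PDF starts with the `%PDF-` header; `looksLikePdf` is a hypothetical helper name and not part of Puppeteer.

```javascript
const fs = require('fs');

// Hypothetical helper: returns true when the buffer starts with the PDF header "%PDF-".
function looksLikePdf(buffer) {
  return buffer.length >= 5 && buffer.subarray(0, 5).toString('latin1') === '%PDF-';
}

// Usage: inspect the file only if the script has already produced it.
if (fs.existsSync('output.pdf')) {
  console.log('output.pdf looks like a PDF:', looksLikePdf(fs.readFileSync('output.pdf')));
}
```

This is a sanity check only; it does not validate the rest of the file.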
Common pitfalls and troubleshooting
When generating PDFs using Puppeteer, developers may encounter issues such as:
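- High memory usage: Run Chromium in headless mode (Puppeteer's default), close pages you no longer need, and reuse one browser instance instead of launching a new one for every PDF.
- Rendering delays: Use the waitUntil option (for example, networkidle0) to ensure all resources are loaded before the PDF is created.
- Cross-platform issues: Ensure dependencies are compatible with the target operating system. Minimal Linux images, for example, often lack the shared libraries that Chromium needs.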
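The memory and rendering advice above can be sketched as a pair of small helpers that reuse one browser and always close each page, even when rendering throws. `withPage` and `renderPdf` are hypothetical names; the `browser` argument is assumed to be the object returned by `puppeteer.launch()`.

```javascript
// Hypothetical helper: open a page, run `work` with it, and always close the page afterward.
async function withPage(browser, work) {
  const page = await browser.newPage();
  try {
    return await work(page);
  } finally {
    await page.close(); // Runs even if `work` threw, so the tab never lingers.
  }
}

// Hypothetical helper: render one HTML string to PDF bytes using a shared browser.
async function renderPdf(browser, html) {
  return withPage(browser, async (page) => {
    // Wait until network activity has settled before printing.
    await page.setContent(html, { waitUntil: 'networkidle0' });
    // Without a `path` option, page.pdf() resolves with the PDF bytes.
    return page.pdf({ format: 'A4', printBackground: true });
  });
}
```

In an application, you would launch the browser once at startup, pass it to `renderPdf` for each job, and call `browser.close()` on shutdown.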
Advanced features and use cases for Node.js PDF generation
Dynamic content
When working with dynamic data, such as user-specific information or database entries, you can generate PDFs that reflect real-time data. For example, imagine generating invoices or personalized reports. This next section will cover how you can dynamically generate a PDF using Puppeteer.
Puppeteer dynamic content example
To create a dynamic PDF, you can modify the HTML content by injecting data fetched from a database, API, or user input. Below is an example of how to replace placeholders in an HTML template with dynamic values using Puppeteer.
- HTML template (template.html):
<body>
<div class="container">
<h1>Hello, {{username}}!</h1>
<p>Your amount is: {{amount}}</p>
</div>
</body>
- JavaScript code:
const puppeteer = require('puppeteer');
const fs = require('fs');

async function generateDynamicPdf(data) {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  // Load the HTML template.
  let html = fs.readFileSync('template.html', 'utf8');

  // Replace every occurrence of each placeholder with dynamic data.
  // Replacer functions keep characters such as "$" in the data literal.
  html = html
    .replaceAll('{{username}}', () => data.username)
    .replaceAll('{{amount}}', () => data.amount);

  // Set the modified HTML content to the page.
  await page.setContent(html, { waitUntil: 'networkidle0' });

  // Generate and save the PDF.
  await page.pdf({
    path: 'dynamic_output.pdf',
    format: 'A4',
    printBackground: true,
  });

  await browser.close();
  console.log('Dynamic PDF generated successfully');
}

// Example dynamic data.
const userData = { username: 'John Doe', amount: '$1500' };
generateDynamicPdf(userData);
The HTML template (template.html) contains placeholders such as {{username}} and {{amount}}, which will be replaced with dynamic data. The JavaScript code uses Puppeteer to launch a browser and create a new page. It then reads the HTML content from the template file and replaces the placeholders with the actual values. After modifying the HTML, Puppeteer loads it into the page and generates a PDF, which is saved as dynamic_output.pdf. This process allows you to create a customized PDF with content injected from user data or other sources.
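The example above inserts values into the HTML verbatim, which is fine for trusted data. If the values come from user input, one possible refinement is to escape them first so they cannot inject markup into the page. The sketch below is a minimal illustration; `escapeHtml` and `fillTemplate` are hypothetical helpers, not part of Puppeteer.

```javascript
// Escape characters that have special meaning in HTML so data cannot inject markup.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Hypothetical helper: replace every {{key}} placeholder with the escaped value from `data`.
// Placeholders without a matching key are left untouched so missing data is easy to spot.
function fillTemplate(template, data) {
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, (match, key) =>
    Object.prototype.hasOwnProperty.call(data, key) ? escapeHtml(data[key]) : match
  );
}

const html = fillTemplate('<h1>Hello, {{username}}!</h1><p>Your amount is: {{amount}}</p>', {
  username: 'John <b>Doe</b>',
  amount: '$1500',
});
console.log(html);
// <h1>Hello, John &lt;b&gt;Doe&lt;/b&gt;!</h1><p>Your amount is: $1500</p>
```

The resulting string can be passed to `page.setContent()` in place of the manually replaced HTML.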
Getting started with Nutrient Document Engine for PDF generation
While Puppeteer offers a simple solution for rendering PDFs from HTML, Nutrient provides a comprehensive PDF management tool that helps you unlock advanced features like editing, annotating, and digitally signing PDFs.
If you’re looking to take your PDF workflows even further with capabilities beyond basic generation, Nutrient’s Document Engine will give you the flexibility you need. The next section will walk you through how to get started with Nutrient and set it up for your Node.js project.
Requirements
Ensure your system meets the following requirements.
- Operating systems:
  - macOS Ventura, Monterey, Mojave, Catalina, or Big Sur.
  - Ubuntu, Fedora, Debian, or CentOS (64-bit Intel and ARM processors supported).
- Memory: At least 4 GB of RAM.
Installing Docker
Document Engine is distributed via Docker. Install Docker by following the appropriate instructions for your OS:
- macOS: Install Docker Desktop for Mac.
- Windows/Linux: Follow the guides on Docker’s website.
Setting up Document Engine
To start Document Engine, you’ll need to configure Docker. Save the following docker-compose.yml file:
version: '3.8'

services:
  document_engine:
    image: pspdfkit/document-engine:1.5.0
    environment:
      PGUSER: de-user
      PGPASSWORD: password
      PGDATABASE: document-engine
      PGHOST: db
      PGPORT: 5432
      API_AUTH_TOKEN: secret
      SECRET_KEY_BASE: secret-key-base
      JWT_PUBLIC_KEY: |
        -----BEGIN PUBLIC KEY-----
        MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEA2gzhmJ9TDanEzWdP1WG+
        0Ecwbe7f3bv6e5UUpvcT5q68IQJKP47AQdBAnSlFVi4X9SaurbWoXdS6jpmPpk24
        QvitzLNFphHdwjFBelTAOa6taZrSusoFvrtK9x5xsW4zzt/bkpUraNx82Z8MwLwr
        t6HlY7dgO9+xBAabj4t1d2t+0HS8O/ed3CB6T2lj6S8AbLDSEFc9ScO6Uc1XJlSo
        rgyJJSPCpNhSq3AubEZ1wMS1iEtgAzTPRDsQv50qWIbn634HLWxTP/UH6YNJBwzt
        3O6q29kTtjXlMGXCvin37PyX4Jy1IiPFwJm45aWJGKSfVGMDojTJbuUtM+8P9Rrn
        AwIDAQAB
        -----END PUBLIC KEY-----
      JWT_ALGORITHM: RS256
      DASHBOARD_USERNAME: dashboard
      DASHBOARD_PASSWORD: secret
    ports:
      - 5000:5000
    depends_on:
      - db
  db:
    image: postgres:16
    environment:
      POSTGRES_USER: de-user
      POSTGRES_PASSWORD: password
      POSTGRES_DB: document-engine
    volumes:
      - pgdata:/var/lib/postgresql/data

volumes:
  pgdata:
Starting Document Engine
Open a terminal, navigate to the directory containing the docker-compose.yml file, and run:
docker-compose up
Wait until you see the message:
document_engine_1 | Access the web dashboard at http://localhost:5000/dashboard
Visit http://localhost:5000/dashboard and authenticate using the following credentials:
- Username: dashboard
- Password: secret
PDF generation from HTML with Document Engine
Document Engine simplifies the process of generating PDFs directly from HTML. Here’s how to do it step by step.
Step 1 — Preparing your HTML template
First, create an HTML template for your content. Use Mustache (installed with npm install mustache) to dynamically inject data into the template. Save this template as template.mustache:
<!DOCTYPE html>
<html>
  <body>
    <div class="address">
      John Smith <br />
      123 Smith Street <br />
      90568 TA <br />
      <br />
      {{date}}
    </div>
    <div class="subject">Subject: PDF Generation FTW!</div>
    <div>
      <p>PDF is great!</p>
    </div>
    <div>
      {{name}} <br />
    </div>
  </body>
</html>
Step 2 — Providing the dynamic data
Create a data.json file with the data that will replace the placeholders in your HTML template:
{
  "name": "John Smith Jr.",
  "date": "29 February, 2020"
}
Step 3 — Rendering HTML using Mustache
Next, render the HTML using Mustache:
const mustache = require('mustache');
const fs = require('fs');

const template = fs.readFileSync('template.mustache').toString();
const data = JSON.parse(fs.readFileSync('data.json').toString());

const outputHtml = mustache.render(template, data);

// Save the rendered HTML file.
fs.writeFileSync('output.html', outputHtml);
console.log('HTML generated successfully.');
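Mustache renders a variable that is missing from the data as an empty string, so a typo in data.json can silently produce a blank field in the PDF. One way to catch this before rendering is sketched below. `findMissingKeys` is a hypothetical helper, and it is a regex approximation rather than a full Mustache parser.

```javascript
// Hypothetical helper: list simple {{name}} tags in a template that have no matching key in `data`.
// This regex approximation ignores sections ({{#...}}), partials, comments, and dotted names.
function findMissingKeys(template, data) {
  const missing = new Set();
  for (const [, key] of template.matchAll(/\{\{\s*([A-Za-z_]\w*)\s*\}\}/g)) {
    if (!Object.prototype.hasOwnProperty.call(data, key)) {
      missing.add(key);
    }
  }
  return [...missing];
}

const missing = findMissingKeys('<div>{{date}}</div><div>{{name}}</div>', {
  name: 'John Smith Jr.',
});
console.log(missing); // [ 'date' ]
```

Running a check like this on the template and data from the previous steps before calling `mustache.render()` lets the script fail early instead of producing an incomplete document.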
Step 4 — Sending HTML to the Document Engine
Instead of converting the HTML yourself, you can send it to the Document Engine API to generate the PDF. Here’s an example that uses axios (installed with npm install axios) to send the rendered HTML:
const axios = require('axios');
const fs = require('fs');

// Read the generated HTML.
const htmlContent = fs.readFileSync('output.html', 'utf8');

// Define the PDF generation schema.
const pdfGenerationSchema = {
  html: htmlContent,
  layout: {
    orientation: 'portrait', // Optional: 'landscape' or 'portrait'.
    size: 'A4', // Optional: 'A4', 'Letter', or custom dimensions.
    margin: {
      // Optional: margin sizes in mm.
      left: 10,
      top: 10,
      right: 10,
      bottom: 10,
    },
  },
};

// Send the HTML to the Document Engine API.
axios
  .post('http://localhost:5000/api/documents', pdfGenerationSchema, {
    headers: {
      // Replace YOUR_API_TOKEN with your token (API_AUTH_TOKEN in docker-compose.yml).
      Authorization: 'Token token=YOUR_API_TOKEN',
      'Content-Type': 'application/json',
    },
    // This example expects PDF bytes in the response, so read the body as raw bytes, not text.
    responseType: 'arraybuffer',
  })
  .then((response) => {
    // Handle the PDF response (e.g. save the PDF file).
    fs.writeFileSync('output.pdf', response.data);
    console.log('PDF generated successfully.');
  })
  .catch((error) => {
    console.error('Error generating PDF:', error);
  });
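Logging the whole error object, as the catch handler above does, can bury the actual cause. axios sets `error.response` when the server replied with a non-2xx status and `error.request` when no response arrived, so a handler can tell these cases apart. The sketch below is one way to do that; `describeRequestError` is a hypothetical helper, and the hints in its messages are assumptions about likely causes rather than documented Document Engine behavior.

```javascript
// Hypothetical helper: turn an axios error into a short, actionable message.
function describeRequestError(error) {
  if (error.response) {
    // The server replied, but with a status outside the 2xx range.
    const { status } = error.response;
    if (status === 401 || status === 403) {
      return `Server rejected the credentials (HTTP ${status}); check the API token.`;
    }
    return `Server responded with HTTP ${status}.`;
  }
  if (error.request) {
    // The request was sent, but no response was received.
    return 'No response received; check that Document Engine is running and reachable.';
  }
  // Something went wrong while setting up the request.
  return `Request could not be sent: ${error.message}`;
}
```

It can replace the body of the catch handler: `.catch((error) => console.error(describeRequestError(error)))`.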
Step 5 — Adding watermarks and cover pages
Document Engine can add extra features like watermarks and cover pages via its API. To add a watermark, include an additional HTML block like this:
<div
style="position: fixed;
top: 50%;
left: 50%;
font-size: 72px;
color: red;
opacity: 0.5;
transform: rotate(-45deg);
text-align: center;
z-index: -1;"
>
My Watermark
</div>
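This places a semi-transparent watermark near the center of the page. Because top and left position the element's top-left corner, prepend translate(-50%, -50%) to the transform value if the text needs to be centered exactly.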
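If the HTML is produced by a script, the watermark block can be injected programmatically instead of being edited into the template. The sketch below inserts the same `<div>` just before the closing `</body>` tag; `addWatermark` is a hypothetical helper, and it assumes the watermark text is trusted (it is not escaped).

```javascript
// Hypothetical helper: insert a watermark <div> just before </body>.
// Assumes `text` is trusted; escape it first if it comes from user input.
function addWatermark(html, text) {
  const watermark =
    '<div style="position: fixed; top: 50%; left: 50%; font-size: 72px; color: red; ' +
    'opacity: 0.5; transform: rotate(-45deg); text-align: center; z-index: -1;">' +
    text +
    '</div>';
  const index = html.lastIndexOf('</body>');
  // If there is no closing </body> tag, append the watermark at the end instead.
  return index === -1 ? html + watermark : html.slice(0, index) + watermark + html.slice(index);
}

console.log(addWatermark('<html><body><p>Report</p></body></html>', 'My Watermark'));
```

The returned string can then be written to output.html or sent to the API in place of the original HTML.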
For a cover page, you can add an additional HTML block with a page break:
<div style="page-break-after: always;">
<h1>Cover Page</h1>
<p>This is the cover page of the PDF.</p>
</div>
Alternatively, upload an existing PDF as the cover page through the Document Engine API:
curl -X POST http://localhost:5000/api/documents \
  -H "Authorization: Token token=<API token>" \
  -F page.html=@/path/to/page.html \
  -F cover.pdf=@/path/to/cover.pdf \
  -F generation='{ "html": "page.html" }' \
  -F operations='{
    "operations": [
      {
        "type": "importDocument",
        "beforePageIndex": 0,
        "document": "cover.pdf"
      }
    ]
  }'
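The same multipart request can be assembled in Node.js. The sketch below only builds the form body with the part names used in the curl command above; it assumes Node.js 18 or later, where `FormData` and `Blob` are available as globals, and `buildCoverPageForm` is a hypothetical helper name.

```javascript
// Requires Node.js 18+ for the global FormData and Blob classes.
// Hypothetical helper: build the same multipart body as the curl command.
function buildCoverPageForm(pageHtml, coverPdfBytes) {
  const form = new FormData();
  // File parts: the HTML page to convert and the PDF to use as the cover.
  form.append('page.html', new Blob([pageHtml], { type: 'text/html' }), 'page.html');
  form.append('cover.pdf', new Blob([coverPdfBytes], { type: 'application/pdf' }), 'cover.pdf');
  // JSON parts: which file to generate from, and where to import the cover.
  form.append('generation', JSON.stringify({ html: 'page.html' }));
  form.append(
    'operations',
    JSON.stringify({
      operations: [{ type: 'importDocument', beforePageIndex: 0, document: 'cover.pdf' }],
    })
  );
  return form;
}
```

The form could then be sent with the built-in `fetch`, for example `fetch('http://localhost:5000/api/documents', { method: 'POST', headers: { Authorization: 'Token token=<API token>' }, body: form })`, which sets the multipart Content-Type header automatically.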
Alternative libraries for PDF generation
While Puppeteer and Nutrient Document Engine are great tools, there are several other libraries available for Node.js that might fit different use cases. Here’s a quick comparison of alternative libraries.
PDFKit
- Overview: PDFKit is a powerful, feature-rich library for creating PDFs programmatically, with support for complex text layouts, vector graphics, and images.
- Best for: Creating highly customizable PDFs, such as invoices, reports, and forms.
- Key features:
  - Vector graphics support (lines, rectangles, circles, etc.)
  - Text and image embedding
  - Complex layout capabilities
jsPDF
- Overview: jsPDF is a lightweight library designed for client-side PDF generation, but it’s also usable server-side with Node.js.
- Best for: Simple, small PDFs like reports, receipts, and charts.
- Key features:
  - Supports basic text, images, and shapes
  - Can create simple PDFs directly in the browser or server-side
  - Easy to use, with a small learning curve
Choosing the right library and handling complex PDF scenarios in Node.js
When generating PDFs in Node.js, selecting the right library depends on your specific needs:
- For simple, fast PDFs: Use jsPDF for straightforward PDF generation.
- For complex document layouts: PDFKit offers powerful features for intricate styling and formatting.
- For high-quality rendering with complex styles: Puppeteer provides the most accurate web-to-PDF conversion, preserving advanced CSS and dynamic content.
- For enterprise-level workflows: Nutrient Document Engine offers advanced document management and automation features.
Handling complex PDF generation scenarios
Generating PDFs from dynamic webpages and processing large datasets can present unique challenges. Here are key strategies and tools to address them:
- Use a headless browser: Libraries like Puppeteer and Playwright enable PDF generation from JavaScript-heavy webpages, ensuring a PDF matches the original content’s appearance.
- Optimize performance: Handling large datasets efficiently requires techniques like caching, parallel processing, and using efficient data structures to improve performance in real-time applications.
- Use templates for dynamic content: Template engines such as Handlebars and EJS allow the separation of content and layout, making it easier to manage and customize PDF structures dynamically.
- Customize layouts: Libraries like PDFKit and jsPDF provide advanced styling options, including custom fonts, images, and tables, ensuring tailored PDF designs to meet branding or formatting requirements.
- Enterprise-grade automation with Nutrient Document Engine: Our SDK provides robust features for document workflows, including merging, form filling, annotation, and advanced security. It operates as a headless service within your infrastructure or can be hosted via Nutrient’s cloud.
- Integrate with other tools: Combine Node.js libraries with external services like Nutrient Document Engine for API-based PDF automation, enabling template-based PDF generation and server-side rendering.
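The parallel processing mentioned above needs a limit, since every concurrent render holds a browser page in memory. The sketch below shows one common pattern: a fixed number of runners pulling jobs from a shared index. `mapWithConcurrency` is a hypothetical helper, and `worker` stands in for whatever function renders a single PDF.

```javascript
// Hypothetical helper: run `worker` over `items` with at most `limit` calls in flight.
// Results are returned in the same order as `items`.
async function mapWithConcurrency(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0;

  // Each runner repeatedly claims the next unprocessed index until none are left.
  async function runner() {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index], index);
    }
  }

  const runnerCount = Math.min(Math.max(1, limit), items.length);
  await Promise.all(Array.from({ length: runnerCount }, () => runner()));
  return results;
}
```

For example, `await mapWithConcurrency(invoices, 4, renderInvoice)` would render a list of invoices four at a time, where `invoices` and `renderInvoice` are placeholders for your own data and render function.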
Example use cases
- Generating PDFs from dynamic webpages: Use Puppeteer or Playwright to render JavaScript-heavy content into PDFs.
- Handling large datasets: Implement caching and parallel processing techniques to optimize data handling.
- Creating custom layouts: Utilize PDFKit and jsPDF for advanced designs, or leverage Nutrient Document Engine for extensive document lifecycle management.
- Automating workflows: Document Engine provides a comprehensive solution for processing, annotating, and securing PDFs, making it ideal for enterprise applications.
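As an illustration of the caching idea, the sketch below keys rendered output by a hash of the template and its data, so identical requests are rendered only once. It is a minimal in-memory example under stated assumptions: `pdfCacheKey` and `getOrRender` are hypothetical helpers, `render` is whatever function produces the PDF, and a production cache would also need eviction.

```javascript
const crypto = require('crypto');

// Hypothetical helper: derive a stable cache key from the template and its data.
// Top-level keys are sorted so { a: 1, b: 2 } and { b: 2, a: 1 } hash identically.
function pdfCacheKey(template, data) {
  const sortedEntries = Object.keys(data)
    .sort()
    .map((key) => [key, data[key]]);
  return crypto
    .createHash('sha256')
    .update(template)
    .update('\0') // Separator so template and data cannot run together.
    .update(JSON.stringify(sortedEntries))
    .digest('hex');
}

// In-memory cache: call `render` only for template/data pairs not seen before.
const pdfCache = new Map();

async function getOrRender(template, data, render) {
  const key = pdfCacheKey(template, data);
  if (!pdfCache.has(key)) {
    pdfCache.set(key, await render(template, data));
  }
  return pdfCache.get(key);
}
```

Two concurrent calls with the same key may both render before the first result is stored; caching the promise instead of the result is one way to avoid that.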
By leveraging these strategies and tools, you can efficiently generate high-quality PDFs in Node.js applications while meeting specific functionality and performance needs.
Conclusion
This post covered how to generate PDFs from HTML in Node.js using Puppeteer for basic HTML-to-PDF tasks and Nutrient Document Engine for more advanced features like annotations and digital signatures. With Nutrient, you can set up Document Engine, render HTML with dynamic data, and create production-grade PDFs through API requests — ideal for generating documents such as reports and invoices. To get an API token, contact our Sales team.
FAQ
Below are some frequently asked questions about Document Engine and working with PDFs.
What are the system requirements for running Document Engine?
Document Engine requires at least 4 GB of RAM and can be run on macOS (Ventura, Monterey, Mojave, Catalina, or Big Sur) and Linux distributions (Ubuntu, Fedora, Debian, CentOS). Docker is required to run the engine.
Do I need Puppeteer or any other tools to generate PDFs with Document Engine?
No, Document Engine handles the entire process of converting HTML to PDF, so you don’t need Puppeteer or any other tool to generate PDFs.
How do I add dynamic content like names and dates to my PDFs?
You can use Mustache templates to inject dynamic data into your HTML before sending it to the Document Engine. This allows you to create personalized PDFs by simply updating the data file.
Can I add watermarks or cover pages to my PDFs?
Yes, Document Engine lets you add watermarks and cover pages through its API. You can either add HTML for a watermark or upload a separate PDF file as a cover page.
How do I interact with Document Engine using Node.js?
You can send an API request using Node.js and a library like axios to pass your HTML content to Document Engine, which will return the generated PDF. The tutorial includes a sample script for making this request.