1.6 release notes
Before attempting to upgrade to Document Engine 1.6, first upgrade to Document Engine 1.5 if you haven’t already, and make sure your application still runs as expected. Read our general advice before doing so.
Highlights
This release introduces compatibility with the new versioning of Nutrient Web SDK (1.0.0 follows 2024.8.x). Make sure to reread the client integration documentation, because Nutrient Web SDK renamed pspdfkit.js to nutrient-viewer.js and introduced NutrientViewer instead of PSPDFKit (the latter of which is still kept as an alias).
Additionally, Document Engine responds with the Server header set to Document Engine/x.y.z, where x.y.z is the version of Document Engine. This change is subtle, but it enables you to identify problems more quickly (e.g. determining if a response is coming from Document Engine or a proxy like NGINX).
The HTML conversion engine no longer triggers unwanted outgoing network activity. There should be no traffic to optimizationguide-pa.googleapis.com (port 443, UDP) and 239.255.255.250 (port 1900, UDP). We fixed the conversion engine configuration to make sure this doesn’t happen again.
TIFF conversion now supports 1-bit TIFF files with min-is-white photometric interpretation and 0 DPI (e.g. barcodes). Version 1.6 also improves the handling of large TIFF files concerning memory allocations and CPU load.
Version 1.6 adds two new upstream API endpoints, allowing you to fetch the text of all pages in a document:
- /api/documents/:document_id/pages/text
- /api/documents/:document_id/layers/:layer_name/pages/text
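To illustrate how the two paths relate, here is a small sketch that builds either one from a document ID and an optional layer name. The helper name is hypothetical; the HTTP method, authentication, and response format are not covered in these notes, so check the API reference before calling the endpoints.

```python
from typing import Optional
from urllib.parse import quote


def pages_text_path(document_id: str, layer_name: Optional[str] = None) -> str:
    """Build the request path for fetching the text of all pages.

    Path segments are percent-encoded so that IDs containing "/" or
    spaces do not alter the route.
    """
    doc = quote(document_id, safe="")
    if layer_name is None:
        return f"/api/documents/{doc}/pages/text"
    layer = quote(layer_name, safe="")
    return f"/api/documents/{doc}/layers/{layer}/pages/text"
```

For example, pages_text_path("abc", "review") yields /api/documents/abc/layers/review/pages/text.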
Breaking changes
JSON Web Token (JWT) authorization requires you to provide a list of document identifiers the user is allowed to access in the JWT claims. If not specified, the user isn’t allowed to access any documents. Examples:
- Not set, set to null, or set to an empty array ({ "allowed_document_ids": [] }): No access allowed
- { "allowed_document_ids": "any" }: Access to any document is allowed
- { "allowed_document_ids": ["foo", "bar"] }: Access to documents with foo or bar identifiers is allowed
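The rules above can be sketched as a small check over decoded JWT claims. This is only an illustration of the semantics described here, not Document Engine's implementation; the function name is hypothetical, and the claims are assumed to be already verified and decoded into a dictionary.

```python
from typing import Any, Mapping


def is_document_allowed(claims: Mapping[str, Any], document_id: str) -> bool:
    """Apply the allowed_document_ids rules to a decoded claims mapping."""
    allowed = claims.get("allowed_document_ids")
    if allowed == "any":
        # The literal string "any" grants access to every document.
        return True
    if isinstance(allowed, list):
        # An explicit list grants access only to the listed identifiers;
        # an empty list therefore grants access to nothing.
        return document_id in allowed
    # Missing claim, null, or any other value: no access.
    return False
```

Tokens issued before the upgrade that omit the claim would fall into the last branch, so it is worth auditing your token generation code when moving to 1.6.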
Database migrations
Document Engine creates two sets of assets whenever it has to convert a document from a different format into a PDF on upload. These assets are:
- The PDF result of the conversion, referred to as source_pdf in the documents table.
- The original file of the conversion, referred to as original_file in the documents table.
Historically, both source_pdf and original_file refer to assets in the pdfs table using an encoded form of the assets’ sha256 hash. This release ships a change that lazily starts using uuid to refer to original_file assets in the documents table instead of sha256.
Changelog
A full list of changes, along with the issue numbers, is available here.