HTML-to-PDF Generation Schema

To support flexible layouts and custom page configuration, Document Engine exposes several configuration options, all driven by a JSON object sent to the /api/documents endpoint when creating a document.

PDF Generation schema declaration

The following block outlines all the available options in the PDF Generation schema, which consists of input files, assets, and layout options:

type Orientation = "landscape" | "portrait";
type PageSize =
  | "A0"
  | "A1"
  | "A2"
  | "A3"
  | "A4"
  | "A5"
  | "A6"
  | "A7"
  | "A8"
  | "Letter"
  | "Legal";

type PdfGenerationSchema = {
  html: string, // The HTML file passed in the multipart request.
  assets?: Array<string>, // All assets imported in the HTML. Reference the name passed in the multipart request.
  layout?: {
    orientation?: Orientation,
    size?: {
      width: number,
      height: number
    } | PageSize, // {width, height} in mm or page size preset.
    margin?: {
      // Margin sizes in mm.
      left: number,
      top: number,
      right: number,
      bottom: number
    }
  }
};

html is the only mandatory field; a schema that provides html alone is the minimal valid configuration. All other fields default to the following:

{
  "assets": [],
  "layout": {
    "orientation": "portrait",
    "size": "A4",
    "margin": {
      "left": 0,
      "top": 0,
      "right": 0,
      "bottom": 0
    }
  }
}
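
The defaulting behavior described above can be sketched as a small TypeScript helper. This is an illustration only: `withDefaults` and `ResolvedSchema` are hypothetical names that are not part of Document Engine, which applies its defaults on the server. The sketch also assumes that a margin, when present, is supplied whole, as the schema requires.

```typescript
type Orientation = "landscape" | "portrait";
type PageSize =
  | "A0" | "A1" | "A2" | "A3" | "A4" | "A5" | "A6" | "A7" | "A8"
  | "Letter" | "Legal";
type Margin = { left: number; top: number; right: number; bottom: number };
type Size = { width: number; height: number } | PageSize;

type PdfGenerationSchema = {
  html: string;
  assets?: string[];
  layout?: { orientation?: Orientation; size?: Size; margin?: Margin };
};

// Hypothetical type: the schema with every optional field filled in.
type ResolvedSchema = {
  html: string;
  assets: string[];
  layout: { orientation: Orientation; size: Size; margin: Margin };
};

// Hypothetical helper that mirrors the documented defaults on the client side.
function withDefaults(schema: PdfGenerationSchema): ResolvedSchema {
  return {
    html: schema.html,
    assets: schema.assets ?? [],
    layout: {
      orientation: schema.layout?.orientation ?? "portrait",
      size: schema.layout?.size ?? "A4",
      margin: schema.layout?.margin ?? { left: 0, top: 0, right: 0, bottom: 0 },
    },
  };
}
```

Calling `withDefaults({ html: "page.html" })` fills in an empty asset list, portrait orientation, the A4 preset, and zero margins, matching the JSON shown above.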

Referencing assets

When designing an HTML page, it’s common to split the design into multiple files, such as an HTML file, a CSS file, and image files. The PDF Generation command expects a flat directory structure, so any referenced assets have to reside next to the HTML file and not in subdirectories.

The following shows how you would send a CSS file and an image that are referenced in the HTML file:

<!DOCTYPE html>
<html>
  <head>
    <link rel="stylesheet" href="style.css" />
  </head>
  <body>
    <h1>PDF Generation Header</h1>
    <img src="my-image.jpg">
  </body>
</html>

h1 {
  font-size: xx-large;
}

curl -X POST http://localhost:5000/api/build \
  -F page.html=@/path/to/page.html \
  -F style.css=@/path/to/style.css \
  -F my-image.jpg=@/path/to/my-image.jpg \
  -F instructions='{
  "parts": [
    {
      "html": "page.html",
      "assets": [
        "style.css",
        "my-image.jpg"
      ]
    }
  ]
}' \
  --output result.pdf

POST /api/build HTTP/1.1
Content-Type: multipart/form-data; boundary=customboundary

--customboundary
Content-Disposition: form-data; name="instructions"
Content-Type: application/json

{
  "parts": [
    {
      "html": "page.html",
      "assets": [
        "style.css",
        "my-image.jpg"
      ]
    }
  ]
}
--customboundary
Content-Disposition: form-data; name="page.html"; filename="page.html"
Content-Type: text/html

<HTML data>
--customboundary
Content-Disposition: form-data; name="style.css"; filename="style.css"
Content-Type: text/css

<CSS data>
--customboundary
Content-Disposition: form-data; name="my-image.jpg"; filename="my-image.jpg"
Content-Type: image/jpeg

<Image data>
--customboundary--

Note that JavaScript assets currently aren’t supported in PDF Generation.

The name of each asset part in the multipart request must match the name used to reference that file in the HTML. For example, if you have an image element, <img src="my-image.jpg">, the part carrying the image data in the multipart request should be named my-image.jpg.
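
The naming rule, together with the flat directory requirement, can be checked before a request is sent. The following TypeScript sketch is illustrative only: `referencedAssets` and `checkAssets` are hypothetical helpers, and the regular expression is a simplification that looks only at quoted `src` and `href` attribute values. It is not how Document Engine resolves assets.

```typescript
// Hypothetical helper: collect the distinct values of quoted src/href attributes.
function referencedAssets(html: string): string[] {
  const pattern = /\b(?:src|href)\s*=\s*(["'])(.*?)\1/gi;
  const names: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(html)) !== null) {
    const name = match[2];
    if (name !== undefined && names.indexOf(name) === -1) {
      names.push(name);
    }
  }
  return names;
}

// Hypothetical helper: list references that would not resolve against the
// names of the multipart parts. Any reference containing "/" is reported,
// which in this simplified check also covers remote URLs.
function checkAssets(html: string, partNames: string[]): string[] {
  const problems: string[] = [];
  for (const name of referencedAssets(html)) {
    if (name.indexOf("/") !== -1) {
      problems.push(`"${name}" contains a path separator`);
    } else if (partNames.indexOf(name) === -1) {
      problems.push(`"${name}" has no matching multipart part`);
    }
  }
  return problems;
}
```

For the page.html shown earlier, `checkAssets(html, ["style.css", "my-image.jpg"])` returns an empty list, because both references are flat names that match a part.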

Page layout

The layout object, which is part of the PDF Generation schema, customizes the layout and dimensions of the PDF pages. All dimensions in this object are in millimeters, and the configuration applies to every page of the generated document.
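
To make the millimeter units concrete, the TypeScript sketch below computes the area left for content once the margins are subtracted from the page. `PRESETS` and `contentArea` are hypothetical names introduced for illustration. The preset dimensions are the standard ISO 216 and US paper sizes rather than values taken from Document Engine, and the handling of orientation (landscape swaps the dimensions of a preset, while a custom size is used as given) is an assumption.

```typescript
type Orientation = "landscape" | "portrait";
type PageSize =
  | "A0" | "A1" | "A2" | "A3" | "A4" | "A5" | "A6" | "A7" | "A8"
  | "Letter" | "Legal";
type Millimeters = { width: number; height: number };
type Margin = { left: number; top: number; right: number; bottom: number };

// Standard portrait dimensions in mm (ISO 216 A series; US sizes converted from inches).
const PRESETS: Record<PageSize, Millimeters> = {
  A0: { width: 841, height: 1189 },
  A1: { width: 594, height: 841 },
  A2: { width: 420, height: 594 },
  A3: { width: 297, height: 420 },
  A4: { width: 210, height: 297 },
  A5: { width: 148, height: 210 },
  A6: { width: 105, height: 148 },
  A7: { width: 74, height: 105 },
  A8: { width: 52, height: 74 },
  Letter: { width: 215.9, height: 279.4 },
  Legal: { width: 215.9, height: 355.6 },
};

// Hypothetical helper: usable content area in mm for a given layout.
function contentArea(
  size: Millimeters | PageSize,
  orientation: Orientation,
  margin: Margin
): Millimeters {
  let page: Millimeters;
  if (typeof size === "string") {
    const preset = PRESETS[size];
    // Assumption: landscape puts the longer edge of a preset horizontally.
    page = orientation === "landscape"
      ? { width: preset.height, height: preset.width }
      : preset;
  } else {
    // Assumption: a custom size is used exactly as given.
    page = size;
  }
  return {
    width: page.width - margin.left - margin.right,
    height: page.height - margin.top - margin.bottom,
  };
}
```

Under these assumptions, a portrait A4 page with 10 mm margins on every side leaves a content area of 190 mm by 277 mm.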