Convert PDF to Office

To convert a PDF document to Office format, send a multipart request to the /api/build API endpoint, including both the source document and the instructions JSON. In response, you’ll receive a ZIP archive containing the Office document.

Specify the desired output format with the format option in the output object of the instructions JSON. Supported Office formats are DOCX, XLSX, and PPTX, and each request can produce only one of them.
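
As a rough illustration of the format option, the Python sketch below builds the instructions JSON for a single uploaded part and rejects any format not listed above. The helper name build_instructions and the SUPPORTED_FORMATS set are hypothetical and not part of Document Engine; the JSON shape follows the examples in this guide.

```python
import json

# Formats this guide lists as supported Office outputs.
SUPPORTED_FORMATS = {"docx", "xlsx", "pptx"}


def build_instructions(part_name: str, office_format: str) -> str:
    """Return instructions JSON that converts one uploaded part to Office."""
    fmt = office_format.lower()
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported Office format: {office_format!r}")
    instructions = {
        # The part name must match the multipart field that carries the PDF.
        "parts": [{"file": part_name}],
        # Exactly one format is set per request.
        "output": {"type": "office", "format": fmt},
    }
    return json.dumps(instructions)


print(build_instructions("document", "docx"))
```

Validating the format on the client side is optional; it only catches typos before the request is sent.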

Before you get started, make sure Document Engine is up and running.

You’ll be sending multipart POST requests with instructions to Document Engine’s /api/build endpoint. To learn more about multipart requests, refer to our blog post on the topic, A Brief Tour of Multipart Requests.
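
To make the structure of such a multipart request concrete, here is a minimal Python sketch that assembles a multipart/form-data body by hand using only the standard library. The function name encode_multipart is hypothetical, the placeholder PDF bytes are not a real document, and the sketch assumes part names and filenames contain no quote characters. In practice an HTTP client library would normally do this encoding for you.

```python
import json
import uuid


def encode_multipart(
    instructions: dict, files: dict[str, tuple[str, str, bytes]]
) -> tuple[bytes, str]:
    """Encode file parts plus an "instructions" part as multipart/form-data.

    `files` maps a part name to (filename, content type, raw bytes).
    Returns the request body and the matching Content-Type header value.
    """
    boundary = uuid.uuid4().hex
    chunks: list[bytes] = []
    for name, (filename, content_type, data) in files.items():
        # Each part starts with the boundary line and its own headers.
        chunks.append(
            (
                f"--{boundary}\r\n"
                f'Content-Disposition: form-data; name="{name}"; filename="{filename}"\r\n'
                f"Content-Type: {content_type}\r\n\r\n"
            ).encode()
        )
        chunks.append(data)
        chunks.append(b"\r\n")
    # The instructions part is JSON; the final boundary ends with "--".
    chunks.append(
        (
            f"--{boundary}\r\n"
            'Content-Disposition: form-data; name="instructions"\r\n'
            "Content-Type: application/json\r\n\r\n"
            f"{json.dumps(instructions)}\r\n"
            f"--{boundary}--\r\n"
        ).encode()
    )
    return b"".join(chunks), f"multipart/form-data; boundary={boundary}"


body, content_type = encode_multipart(
    {"parts": [{"file": "document"}], "output": {"type": "office", "format": "docx"}},
    {"document": ("example-document.pdf", "application/pdf", b"%PDF placeholder")},
)
```

The returned content_type value goes into the Content-Type request header so the server can find the boundary.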

Check out the API Reference to learn more about the /api/build endpoint and all the actions you can perform on PDFs with Document Engine.

Converting a PDF document from a local file to Office format

Send a multipart request to the /api/build endpoint, attaching an input file and the instructions JSON. Below is an example for converting a PDF document to Word (DOCX):

curl -X POST http://localhost:5000/api/build \
  -H "Authorization: Token token=<API token>" \
  -F document=@/path/to/example-document.pdf \
  -F instructions='{
  "parts": [
    {
      "file": "document"
    }
  ],
  "output": {
    "type": "office",
    "format": "docx",
    "pages": {
      "range": "1-5"
    }
  }
}' \
  -o result.zip

The same request as raw HTTP:

POST /api/build HTTP/1.1
Content-Type: multipart/form-data; boundary=customboundary
Authorization: Token token=<API token>

--customboundary
Content-Disposition: form-data; name="document"; filename="example-document.pdf"
Content-Type: application/pdf

<PDF data>
--customboundary
Content-Disposition: form-data; name="instructions"
Content-Type: application/json

{
  "parts": [
    {
      "file": "document"
    }
  ],
  "output": {
    "type": "office",
    "format": "docx",
    "pages": {
      "range": "1-5"
    }
  }
}
--customboundary--
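
Because the response is a ZIP archive rather than the Office file itself, a small unpacking step is usually needed. The Python sketch below extracts any Office documents from the archive bytes. The function name extract_office_files is hypothetical, and since this guide does not specify the entry names inside the archive, the sketch simply selects entries by file extension.

```python
import io
import zipfile
from pathlib import Path

OFFICE_SUFFIXES = (".docx", ".xlsx", ".pptx")


def extract_office_files(zip_bytes: bytes, dest_dir: str) -> list[Path]:
    """Write every Office document found in the archive into dest_dir."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    written: list[Path] = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for info in archive.infolist():
            if info.is_dir() or not info.filename.lower().endswith(OFFICE_SUFFIXES):
                continue
            # Keep only the base name so an entry cannot escape dest_dir.
            target = dest / Path(info.filename).name
            target.write_bytes(archive.read(info))
            written.append(target)
    return written
```

For the curl example above, this could be called as extract_office_files(Path("result.zip").read_bytes(), "converted"), where the "converted" directory name is arbitrary.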

Converting a PDF document from a URL to Office format

Send a multipart request to the /api/build endpoint with instructions JSON that references the input file by URL. No file upload is needed, so the instructions are the only part of the request. Below is an example for converting a PDF document to Excel (XLSX):

curl -X POST http://localhost:5000/api/build \
  -H "Authorization: Token token=<API token>" \
  -F instructions='{
  "parts": [
    {
      "file": {
        "url": "https://www.nutrient.io/downloads/examples/paper.pdf"
      }
    }
  ],
  "output": {
    "type": "office",
    "format": "xlsx",
    "pages": {
      "start": 1,
      "end": 1
    }
  }
}' \
  -o result.zip

The same request as raw HTTP:

POST /api/build HTTP/1.1
Content-Type: multipart/form-data; boundary=customboundary
Authorization: Token token=<API token>

--customboundary
Content-Disposition: form-data; name="instructions"
Content-Type: application/json

{
  "parts": [
    {
      "file": {
        "url": "https://www.nutrient.io/downloads/examples/paper.pdf"
      }
    }
  ],
  "output": {
    "type": "office",
    "format": "xlsx",
    "pages": {
      "start": 1,
      "end": 1
    }
  }
}
--customboundary--
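
For readers calling the endpoint from code rather than curl, the following Python sketch prepares the URL-based request with the standard library. It only builds the request and does not send it. The function name build_url_conversion_request and its parameters are hypothetical; the endpoint, Authorization header format, and JSON shape are taken from the examples above, and the pages option is omitted for brevity.

```python
import json
import urllib.request
import uuid


def build_url_conversion_request(
    pdf_url: str,
    office_format: str,
    api_token: str,
    endpoint: str = "http://localhost:5000/api/build",
) -> urllib.request.Request:
    """Prepare (but do not send) a /api/build request for a remote PDF."""
    instructions = {
        "parts": [{"file": {"url": pdf_url}}],
        "output": {"type": "office", "format": office_format},
    }
    boundary = uuid.uuid4().hex
    # The instructions JSON is the only multipart part in the URL variant.
    body = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="instructions"\r\n'
        "Content-Type: application/json\r\n\r\n"
        f"{json.dumps(instructions)}\r\n"
        f"--{boundary}--\r\n"
    ).encode()
    return urllib.request.Request(
        endpoint,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Token token={api_token}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


request = build_url_conversion_request(
    "https://www.nutrient.io/downloads/examples/paper.pdf", "xlsx", "<API token>"
)
```

With Document Engine running, the request could then be sent with urllib.request.urlopen(request) and the response bytes saved as result.zip.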