Extract Data from Bank Statements

This guide explains how to extract key-value pairs (KVPs) from bank statements using Document Engine. For example, this enables you to extract IBANs or account numbers. For more information, refer to the guide on how key-value pair extraction works.

Sending the Request to Extract Data

To extract key-value pairs from a bank statement, post a multipart request to the /api/build endpoint. In the instructions, specify the following output parameters:

  • type specifies the output type. Set this to json-content.
  • keyValuePairs is a Boolean value that determines whether to extract key-value pairs. Set this to true.
  • language specifies the language used for recognizing text with optical character recognition (OCR). Text in a PDF or an image is sometimes stored in a way that prevents you from searching or copying it. PSPDFKit’s OCR engine recognizes such text and saves it in a separate file in which you can search, copy, and paste it. For more information, refer to the list of supported languages.
curl -X POST http://localhost:5000/api/build \
  -H "Authorization: Token token=<API token>" \
  -F document=@/path/to/example-document.pdf \
  -F instructions='{
      "parts": [
        {
          "file": "document"
        }
      ],
      "output": {
        "type": "json-content",
        "keyValuePairs": true,
        "language": "english"
      }
    }' \
  -o result.json

For more information on the Build instructions, refer to the API Reference.
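
When the request is assembled in a script rather than typed by hand, the instructions part can be generated programmatically. The following Python sketch builds the same instructions JSON as the curl command above, using only the standard library. The function name build_kvp_instructions and its parameters are hypothetical and not part of Document Engine.

```python
import json


def build_kvp_instructions(part_name: str = "document", language: str = "english") -> str:
    """Return the instructions JSON for a key-value pair extraction request.

    Hypothetical helper: it only mirrors the instructions shown in the
    curl example and does not contact Document Engine.
    """
    instructions = {
        # part_name must match the name of the multipart file field.
        "parts": [{"file": part_name}],
        "output": {
            "type": "json-content",
            "keyValuePairs": True,
            "language": language,
        },
    }
    return json.dumps(instructions)


if __name__ == "__main__":
    print(build_kvp_instructions())
```

The returned string can be sent as the instructions form field alongside the document file, in the same way the curl command passes it with -F.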

Example Data Extraction Response

{
  "pages": [
    {
      "pageIndex": 0,
      "keyValuePairs": [
        {
          "confidence": 95.4,
          "key": {
            "bbox": {
              "left": 0,
              "top": 0,
              "width": 100,
              "height": 100
            },
            "content": "IBAN"
          },
          "value": {
            "bbox": {
              "left": 0,
              "top": 0,
              "width": 100,
              "height": 100
            },
            "content": "FR7611808009101234567890147",
            "dataType": "String"
          }
        }
      ]
    }
  ]
}
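
Once the response is saved, the extracted pairs can be read with any JSON parser. The Python sketch below flattens a response shaped like the example above into (page index, key, value, confidence) tuples and skips pairs below a confidence threshold. The function name, the default threshold of 90, and the assumption that confidence is reported on a 0 to 100 scale (inferred from the 95.4 in the example) are illustrative and not taken from the API reference.

```python
import json
from typing import Iterator, Tuple


def iter_key_value_pairs(
    response: dict, min_confidence: float = 90.0
) -> Iterator[Tuple[int, str, str, float]]:
    """Yield (page index, key text, value text, confidence) for each pair.

    Hypothetical helper; assumes the response layout of the example above.
    """
    for page in response.get("pages", []):
        for pair in page.get("keyValuePairs", []):
            confidence = pair.get("confidence", 0.0)
            # Skip pairs the engine was less sure about (assumed 0-100 scale).
            if confidence < min_confidence:
                continue
            yield (
                page["pageIndex"],
                pair["key"]["content"],
                pair["value"]["content"],
                confidence,
            )


if __name__ == "__main__":
    # Trimmed copy of the example response (bounding boxes omitted).
    sample = json.loads("""
    {"pages": [{"pageIndex": 0, "keyValuePairs": [
        {"confidence": 95.4,
         "key": {"content": "IBAN"},
         "value": {"content": "FR7611808009101234567890147",
                   "dataType": "String"}}]}]}
    """)
    for page_index, key, value, confidence in iter_key_value_pairs(sample):
        print(f"page {page_index}: {key} = {value} ({confidence})")
```

To process a real response, load the file written by the -o option of the curl command with json.load and pass the resulting dictionary to the function.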