Extract invoice data with C# and OCR
Nutrient .NET SDK’s (formerly GdPicture.NET) key-value pair (KVP) extraction engine enables you to recognize related data items in a document and export them to an external destination like a spreadsheet.
To extract data items from an invoice, follow the steps below:
- Create a `GdPictureOCR` object and a `GdPictureImaging` object.
- Select the invoice by passing its path to the `CreateGdPictureImageFromFile` method of the `GdPictureImaging` object.
- Configure the OCR process with the `GdPictureOCR` object in the following way:
  - Set the invoice with the `SetImage` method.
  - Set the path to the OCR resource folder with the `ResourceFolder` property. The default language resources are located in `GdPicture.NET 14\Redist\OCR`. For more information on adding language resources, see the language support guide.
  - With the `AddLanguage` method, add the language resources that Nutrient .NET SDK uses to recognize text in the image. This method takes a member of the `OCRLanguage` enumeration.
- Run the OCR process with the `RunOCR` method of the `GdPictureOCR` object.
- Get the number of key-value pairs detected during the OCR process with the `GetKeyValuePairCount` method of the `GdPictureOCR` object, and loop through them.
- Get the key-value pairs, the data types, and the confidence scores with the following methods:
  - `GetKeyValuePairKeyString`
  - `GetKeyValuePairValueString`
  - `GetKeyValuePairDataType`
  - `GetKeyValuePairConfidence`
- Write the output to the console.
- Release unnecessary resources.
The example below retrieves key-value pairs from the following invoice.

Download the sample invoice and run the code below, or check out our demo.
```csharp
using System;
using GdPicture14;

using GdPictureOCR gdpictureOCR = new GdPictureOCR();
using GdPictureImaging gdpictureImaging = new GdPictureImaging();

// Load the source document.
int imageId = gdpictureImaging.CreateGdPictureImageFromFile(@"C:\temp\source.png");

// Configure the OCR process.
gdpictureOCR.ResourceFolder = @"C:\GdPicture.NET 14\Redist\OCR";
gdpictureOCR.AddLanguage(OCRLanguage.English);
gdpictureOCR.SetImage(imageId);

// Run the OCR process.
string ocrResultId = gdpictureOCR.RunOCR();

// Collect the key, value, data type, and confidence of each detected pair.
string keyValuePairsData = "";
int pairCount = gdpictureOCR.GetKeyValuePairCount(ocrResultId);
for (int pairIndex = 0; pairIndex < pairCount; pairIndex++)
{
    keyValuePairsData +=
        $"| Key: {gdpictureOCR.GetKeyValuePairKeyString(ocrResultId, pairIndex)} | " +
        $"Value: {gdpictureOCR.GetKeyValuePairValueString(ocrResultId, pairIndex)} | " +
        $"Document Type: {gdpictureOCR.GetKeyValuePairDataType(ocrResultId, pairIndex)} | " +
        $"Confidence Level: {Math.Round(gdpictureOCR.GetKeyValuePairConfidence(ocrResultId, pairIndex), 1)}% |\n";
}

// Write the output to the console.
Console.WriteLine(keyValuePairsData);

// Release unnecessary resources.
gdpictureImaging.ReleaseGdPictureImage(imageId);
gdpictureOCR.ReleaseOCRResults();
```

```vb
Imports System
Imports GdPicture14

Using gdpictureOCR As GdPictureOCR = New GdPictureOCR()
    Using gdpictureImaging As GdPictureImaging = New GdPictureImaging()
        ' Load the source document.
        Dim imageId As Integer = gdpictureImaging.CreateGdPictureImageFromFile("C:\temp\source.png")

        ' Configure the OCR process.
        gdpictureOCR.ResourceFolder = "C:\GdPicture.NET 14\Redist\OCR"
        gdpictureOCR.AddLanguage(OCRLanguage.English)
        gdpictureOCR.SetImage(imageId)

        ' Run the OCR process.
        Dim ocrResultId As String = gdpictureOCR.RunOCR()

        ' Collect the key, value, data type, and confidence of each detected pair.
        Dim keyValuePairsData = ""
        For pairIndex As Integer = 0 To gdpictureOCR.GetKeyValuePairCount(ocrResultId) - 1
            keyValuePairsData +=
                $"| Key: {gdpictureOCR.GetKeyValuePairKeyString(ocrResultId, pairIndex)} | " &
                $"Value: {gdpictureOCR.GetKeyValuePairValueString(ocrResultId, pairIndex)} | " &
                $"Document Type: {gdpictureOCR.GetKeyValuePairDataType(ocrResultId, pairIndex)} | " &
                $"Confidence Level: {Math.Round(gdpictureOCR.GetKeyValuePairConfidence(ocrResultId, pairIndex), 1)}% |" & vbLf
        Next

        ' Write the output to the console.
        Console.WriteLine(keyValuePairsData)

        ' Release unnecessary resources.
        gdpictureImaging.ReleaseGdPictureImage(imageId)
        gdpictureOCR.ReleaseOCRResults()
    End Using
End Using
```
Format the output to obtain the following table:
| Key | Value | Document type | Confidence level |
|---|---|---|---|
| Billing date | 20/09/2022 | DateTime | 100% |
| Order date | 20/09/2022 | DateTime | 100% |
| Republic of PDF | +100 847 738 227 | PhoneNumber | 77.2% |
| IBAN | AT13 2060 4236 6111 5994 | IBAN | 100% |
| Customer | Vandelay Industries Around the Corner 13 NBC City | String | 69.8% |
| Delivery address | Vandelay Industries Around the Corner 13 NBC City | String | 69.9% |
| Invoice number | No 00162 | String | 70.9% |
| Ref. number | 34751 | Number | 92.9% |
| No | 00162 | Number | 100% |
| Reference | P00201 | UID | 100% |
| Quantity total (excl. VAT) | 320.00€ | Currency | 59% |
| Subtotal | 1,220.00€ | Currency | 100% |
| Discount (10%) | -122.00€ | Currency | 90.6% |
| VAT (5.5%) | +6710€ | Currency | 66.9% |
| Shipping cost | 0.00€ | Currency | 75% |
| TOTAL | 1,165.10€ | Currency | 100% |
| Description | Lake Mirror | String | 99.6% |
| VAT | 5.5% | Percentage | 66.6% |
| Price per unit (excl. VAT) | 320.00€ | Currency | 80% |
| Tax No. | AT98765321 | UID | 73.8% |
| # | infe@bruuuk.com | EmailAddress | 65.6% |
| # | www.bruuuk.com | URL | 65.6% |
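
The guide does not show how the console output is turned into the table above. The following is a minimal sketch of one possible approach, assuming the extracted pairs have first been copied into plain objects. `ExtractedPair` and `KvpTableFormatter` are hypothetical names introduced here for illustration and are not part of Nutrient .NET SDK.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text;

// Hypothetical container for one extracted pair; not an SDK type.
public sealed class ExtractedPair
{
    public ExtractedPair(string key, string value, string dataType, double confidence)
    {
        Key = key;
        Value = value;
        DataType = dataType;
        Confidence = confidence;
    }

    public string Key { get; }
    public string Value { get; }
    public string DataType { get; }
    public double Confidence { get; }
}

// Hypothetical helper that renders pairs as a Markdown table.
public static class KvpTableFormatter
{
    public static string ToMarkdownTable(IEnumerable<ExtractedPair> pairs)
    {
        var table = new StringBuilder();
        table.Append("| Key | Value | Document type | Confidence level |\n");
        table.Append("|---|---|---|---|\n");

        foreach (ExtractedPair pair in pairs)
        {
            // Keep at most one decimal place and use "." as the decimal
            // separator regardless of the machine's regional settings.
            string confidence = pair.Confidence.ToString("0.#", CultureInfo.InvariantCulture);

            table.Append("| ").Append(Escape(pair.Key))
                 .Append(" | ").Append(Escape(pair.Value))
                 .Append(" | ").Append(Escape(pair.DataType))
                 .Append(" | ").Append(confidence).Append("% |\n");
        }

        return table.ToString();
    }

    // A literal pipe inside a value would otherwise start a new table cell.
    private static string Escape(string text)
    {
        return (text ?? string.Empty).Replace("|", "\\|");
    }
}
```

To use it with the extraction code, one option is to add an `ExtractedPair` to a list inside the loop (built from the key string, value string, data type converted with `ToString()`, and confidence) and pass that list to `KvpTableFormatter.ToMarkdownTable` instead of concatenating a single string.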