Getting started with AI Document Processing

This guide explains how to integrate AI Document Processing into your application.

What is AI Document Processing

AI Document Processing is an intelligent document processing (IDP) SDK that extends our existing key-value pair (KVP) technology with large language models (LLMs) to deliver best-in-class extraction and classification accuracy.

By combining KVP extraction with LLMs, it achieves higher accuracy than alternatives that rely on AI/ML alone.

You can use it either as a REST microservice, which can be hosted in any global region, or as an API integrated directly into desktop or server applications.

It doesn’t store documents or any extracted content, which makes it easier to align with a wide range of data processing and retention policies.

Prerequisites

Before you follow the procedure in this guide, ensure you have:

  • Visual Studio. If you don’t have it, download and install it from Visual Studio Downloads.

  • A GdPicture.NET license key. If you don’t already have a GdPicture.NET license, contact our Sales team to request a trial.

  • An LLM provider API key. AI Document Processing currently supports OpenAI and Azure OpenAI (other LLMs will be supported soon).

Step 1: Obtaining an LLM provider API key (optional)

Skip this step if you already have an API key from one of the supported LLM providers.

Creating an OpenAI account

To obtain an API key, sign up for an OpenAI account.

The OpenAI API has attained SOC 2 Type 2 compliance (see the official announcement).

Creating an Azure OpenAI account

To create an Azure OpenAI account, follow the quickstart guide. For information on how data provided to the Azure OpenAI service is processed, used, and stored, see the article on data, privacy, and security for Azure OpenAI Service.

Azure OpenAI Service can be used in a HIPAA-compliant manner.

Step 2: Installing AI Document Processing

The AI Document Processing SDK (formerly known as XtractFlow) is delivered as a NuGet package.

To incorporate the NuGet reference into your application:

  1. Right-click the project name in Solution Explorer and click Manage NuGet Packages….

  2. Enter XtractFlow in the search bar. In the search results, choose GdPicture.XtractFlow and click Install.
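
If you prefer the command line to the Visual Studio UI, the same package can likely be added with the .NET CLI. The following is a minimal sketch, assuming the .NET SDK is installed and the command is run from the folder that contains your project file. The package ID is the one named in the step above.

```shell
# Run from the directory that contains your .csproj file.
# Assumes the .NET SDK (which provides the dotnet CLI) is installed.
dotnet add package GdPicture.XtractFlow
```

Without a `--version` option, `dotnet add package` installs the latest stable version available from your configured package sources.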

Step 3: Machine vision resources (optional)

If your use case requires them, download the additional resources, such as OCR models and language packs.

Extract the contents of the ZIP file, and then reference the resource folder in your code using the following line:

Configuration.ResourceFolder = "<PATH_TO_OCR_RESOURCES>";

Next steps

After installation, take a look at our guides to see some examples of how to use AI Document Processing: