OCR and data extraction
Give new life to scanned documents
Enable the seamless capture, processing, and integration of physical documents into your users’ digital workflows with robust scanning features.
Use cases
Create searchable PDFs from scanned documents
Capture and convert physical documents into digital formats using TWAIN and WIA scanning protocols, enabling easy integration into electronic workflows.
Extract data from PDFs
Use table extraction and data parsing to read semi-structured and unstructured data in PDFs and convert it into structured formats like Excel.
Recognize text in multiple languages
Use OCR to accurately recognize and extract text in multiple languages, ensuring global compatibility and ease of processing diverse documents.
Automate data entry
Populate data fields in a database or other system by extracting structured tabular data from documents.
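As a rough illustration of this step, the sketch below loads rows into a database once they have been extracted. It assumes the table data is already available as a list of dictionaries; the `extracted_rows` sample, the `invoices` table, and the `load_rows` helper are hypothetical names chosen for this example and are not part of any SDK. It uses only Python's standard `sqlite3` module.

```python
import sqlite3

# Hypothetical rows, standing in for a table already extracted from a document.
extracted_rows = [
    {"invoice_no": "INV-001", "customer": "Acme Ltd", "total": "1250.00"},
    {"invoice_no": "INV-002", "customer": "Globex", "total": "89.90"},
]


def load_rows(conn, rows):
    """Insert extracted rows into an invoices table and return the row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS invoices ("
        "invoice_no TEXT PRIMARY KEY, customer TEXT, total REAL)"
    )
    # Named placeholders keep extracted text out of the SQL string itself.
    conn.executemany(
        "INSERT OR REPLACE INTO invoices VALUES (:invoice_no, :customer, :total)",
        [{**row, "total": float(row["total"])} for row in rows],
    )
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM invoices").fetchone()[0]


conn = sqlite3.connect(":memory:")
count = load_rows(conn, extracted_rows)
```

Extracted values typically arrive as text, so the sketch converts `total` to a number before inserting, and `INSERT OR REPLACE` lets the same document be processed twice without creating duplicate rows.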
Detect barcodes, OMR, MICR, and MRZ data
Leverage powerful, built-in functions for scanning, decoding, and improving the quality of data across a variety of document types.
AI Document Processing
Attain human-level precision in classifying and extracting data from text and image documents, without predefined rules or coding.
How we help
Image Processing
Edit, print, and preprocess raster and vector images
Improve OCR, OMR, and barcode detection with advanced character and symbol recognition. Use 500+ powerful low-level functions to clean up, manipulate, and edit more than 100 document and image formats.
Frequently asked questions
What features are available for OCR out of the box?
The following features and capabilities are available for OCR:
- Full Unicode support
- Multithread support
- Character recognition confidence
- Retrieving a character’s location
- Retrieving font information (e.g. style, family)
- Retrieving paragraph information (e.g. justification, alignment, bounding box)
- Direct conversion from an image or image-based PDF to PDF
- Extracting OCR results as text
- Getting the OCR result based on internal GdPicture structures serialized as a JSON string
- Recognizing only digits, only alphabetic characters, or results based on allowed or disallowed characters
- OCR context support (defines if the engine is processing a document, a single word, a single character, a text block, vertical text, etc.)
- Orientation detection
How easy is the OCR setup process?
Setup is straightforward: our code snippets, samples, and complete documentation walk you through each step.
Does Nutrient handle advanced PDF OCR features?
Yes. To learn more, visit our OCR guide.
Which languages are supported by OCR?
Nutrient Web SDK can perform OCR in the following languages: Croatian, Czech, Danish, Dutch, English, Finnish, French, German, Indonesian, Italian, Malay, Norwegian, Polish, Portuguese, Serbian, Slovak, Slovenian, Spanish, Swedish, Turkish, and Welsh. To learn more, visit our OCR guide.
Is there a trial version of Nutrient OCR?
Yes. You can begin a free trial by selecting the free trial option at the top of each page.