Supported Languages: 100+ OCR Language Dictionaries
The PSPDFKit GdPicture.NET library includes the following language dictionaries for recognizing text with OCR:
Language | Code |
---|---|
Arabic | ara |
German | deu |
English | eng |
French | fra |
Hebrew | heb |
Italian | ita |
Dutch, Flemish | nld |
Portuguese | por |
Spanish, Castilian | spa |
Vietnamese | vie |
To recognize languages not listed above, follow these steps:
- Download the language files provided by the Tesseract team, which cover more than 120 languages. To use older language data files that do not rely on the long short-term memory (LSTM) engine, download a previous release provided by the Tesseract team.
- Add the language files to the folder where your OCR dictionaries are already installed. The default language resources are located in `GdPicture.NET 14\Redist\OCR`.
- Determine language names based on the language codes and the Tesseract documentation.
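
The copy and lookup steps above can be sketched as a small helper script. This is a minimal illustration, not part of GdPicture.NET: the function names `install_language_files` and `describe` are hypothetical, and the sketch assumes the downloaded Tesseract files follow the usual `<code>.traineddata` naming. The name mapping contains only the codes from the table above; any other code still has to be looked up in the Tesseract documentation.

```python
import shutil
from pathlib import Path

# Codes and names copied from the table above. Extend this mapping by hand
# from the Tesseract documentation for each additional language you install.
KNOWN_LANGUAGES = {
    "ara": "Arabic",
    "deu": "German",
    "eng": "English",
    "fra": "French",
    "heb": "Hebrew",
    "ita": "Italian",
    "nld": "Dutch, Flemish",
    "por": "Portuguese",
    "spa": "Spanish, Castilian",
    "vie": "Vietnamese",
}


def install_language_files(download_dir, ocr_dir):
    """Copy every *.traineddata file from download_dir into ocr_dir.

    Hypothetical helper: returns the list of language codes that were copied,
    derived from the file names (assumed to be "<code>.traineddata").
    """
    download_dir, ocr_dir = Path(download_dir), Path(ocr_dir)
    if not ocr_dir.is_dir():
        raise FileNotFoundError(f"OCR resource folder not found: {ocr_dir}")
    copied = []
    for src in sorted(download_dir.glob("*.traineddata")):
        shutil.copy2(src, ocr_dir / src.name)
        copied.append(src.stem)  # "swe.traineddata" -> "swe"
    return copied


def describe(code):
    """Return a display name for a code, or a reminder to look it up."""
    return KNOWN_LANGUAGES.get(
        code, f"{code} (look up in the Tesseract documentation)"
    )
```

To use it, call `install_language_files` with the folder that holds your downloaded files and the full path to your `GdPicture.NET 14\Redist\OCR` folder, then pass each returned code to `describe`. Files with other extensions in the download folder are ignored.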