How to unlock PDF functionality with OCR in iOS
Table of contents
- Use OCR to make scanned PDFs interactive and text computer-readable
- Extract words, select text, highlight passages, and search documents
- Integrate the OCR framework and add language bundles for recognition
- Process documents with the Processor API and display results in PDFViewController
With Nutrient iOS SDK 9.5, we introduced optical character recognition (OCR) for PDFs. OCR makes inaccessible text in a PDF — whether scanned or consisting of vector graphics — interactive and computer-readable.
The following examples demonstrate how to add text-related functionality to documents with previously inaccessible text by combining OCR and PDF features.
Use cases for OCR
OCR exposes previously inaccessible text in scanned or photographed images converted to PDF. Access to the underlying text enables word extraction, text selection, passage highlighting, and phrase searching.
Integrating OCR functionality
To use OCR in an app, integrate the Nutrient OCR framework and add the appropriate language bundles. For complete instructions, see our OCR integration guide.
Performing OCR
Once the OCR framework is integrated, use the API to perform OCR on a document. The following code performs OCR on the first page, detects English text, saves the result to a new location, and displays it:
    guard let processorConfiguration = Processor.Configuration(document: document) else { return }
    processorConfiguration.performOCROnPages(at: IndexSet(integer: 0), options: ProcessorOCROptions(language: .english))
    let processor = Processor(configuration: processorConfiguration, securityOptions: nil)
    let ocrURL: URL = ... // File URL for OCRed document to be saved at.

    DispatchQueue.global(qos: .userInitiated).async {
        do {
            try processor.write(toFileURL: ocrURL)
        } catch {
            // Handle error.
            return
        }
        DispatchQueue.main.async {
            let ocrDocument = Document(url: ocrURL)
            pdfController.document = ocrDocument
        }
    }

This creates a processor configuration from the document that needs OCR. We then call the performOCROnPages(at:options:) method to mark pages for OCR. It takes an index set containing the pages to process, as well as the language the text should be recognized in.
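The page set and output location used in the snippet above can be computed rather than hardcoded. The following is a minimal sketch using only Foundation. The helper names `ocrPageIndexes(pageCount:skipping:)` and `makeOCROutputURL(for:)` are hypothetical and not part of the SDK, and writing to the temporary directory is an assumption; an app may prefer a permanent location.

```swift
import Foundation

/// Builds the zero-based page indexes to run OCR on.
/// Hypothetical helper: `pageCount` and `skipping` are illustrative inputs.
func ocrPageIndexes(pageCount: Int, skipping skipped: IndexSet = IndexSet()) -> IndexSet {
    guard pageCount > 0 else { return IndexSet() }
    var indexes = IndexSet(integersIn: 0..<pageCount)
    // Drop pages that already contain accessible text, for example.
    indexes.subtract(skipped)
    return indexes
}

/// Derives a unique output file URL from the source file name.
/// Assumption: the temporary directory is an acceptable place for the result.
func makeOCROutputURL(for sourceURL: URL) -> URL {
    let baseName = sourceURL.deletingPathExtension().lastPathComponent
    return FileManager.default.temporaryDirectory
        .appendingPathComponent("\(baseName)-ocr-\(UUID().uuidString)")
        .appendingPathExtension("pdf")
}
```

The resulting index set could be passed to performOCROnPages(at:options:) and the URL to write(toFileURL:). Adding a UUID to the file name avoids overwriting the output of an earlier run.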
Next, we create the processor with the configuration and choose a URL where the output PDF should be saved. We then call the write(toFileURL:) method on a background queue, because writing the document to disk can take a few seconds and we don’t want to block the main thread.
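The pattern described above, doing slow work in the background and returning to the main thread with the result, can be factored into a small reusable helper. This is a sketch under assumptions rather than an SDK API: `performInBackground` is a hypothetical name, and the code targets the Swift 5 language mode (strict concurrency checking would require additional Sendable annotations).

```swift
import Foundation

/// Runs `work` on a background queue and delivers its result on `completionQueue`.
/// Hypothetical helper for illustration; not part of the SDK.
func performInBackground<T>(
    _ work: @escaping () throws -> T,
    completionQueue: DispatchQueue = .main,
    completion: @escaping (Result<T, Error>) -> Void
) {
    DispatchQueue.global(qos: .userInitiated).async {
        // Capture either the returned value or the thrown error.
        let result = Result { try work() }
        // Hop to the completion queue (main by default) before touching UI.
        completionQueue.async { completion(result) }
    }
}
```

In the OCR case, the work closure would call write(toFileURL:) and return the output URL, and the completion handler would either create the document and assign it to the PDF controller or present the error.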
When the PDF has been written, we switch back to the main thread, create a document from the URL of the new file, and show it in PDFViewController.
For more details and examples, see our OCR usage guide.
Working with text on the processed document
After OCR processing, the document supports text-related functionality.
Extraction
Text can be extracted from a document after performing OCR. There are various APIs for getting text from a document’s pages, the most important of which is the text parser.
The text parser exposes a page’s text in forms that are easy to work with. One of them is a list of all the words on the page. Here’s an example of how to extract the first word of the first page:
    let textParser = document.textParserForPage(at: 0)!
    let word = textParser.words.first!

We’re using the document’s text parser to get the textual representation of the first page, and then reading the first element of its words property to get the first word on the page. The force unwraps keep the example short; production code should handle a missing text parser or an empty page.
Selection
Text selection works through Nutrient’s UI or programmatically. All recognized text can be selected. To select the first word of the first page:
    let textParser = document.textParserForPage(at: 0)!
    let word = textParser.words.first!

    let pageView = pdfController.pageViewForPage(at: 0)!
    let selectionView = pageView.selectionView

    selectionView.selectedGlyphs = textParser.glyphs(in: word.range)

We’re getting the first word of the first page from the text parser. Then we’re setting the selectedGlyphs property of the page view’s text selection view to mark that word as selected.
Highlight
Not only can we select text on the new document; we can also add highlights. Highlights are text markup annotations that will add a background to the selected text.
The code below shows how to highlight a selected phrase:
    let highlightAnnotation = HighlightAnnotation.textOverlayAnnotation(with: selectionView.selectedGlyphs)!
    document.add(annotations: [highlightAnnotation])

This takes the currently selected text, creates a highlight annotation with the default style, and adds it to the document.
Search
SearchViewController lets users of an app search all text across a document. By default, the search UI can be accessed via the search button item shown in the navigation bar of the PDF controller. Search also covers text recognized by OCR.
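To illustrate what searching recognized text amounts to, here is a minimal, SDK-independent sketch. It assumes the text of each page has already been collected into an array of strings (for example, from each page’s text parser). `PhraseMatch` and `findPhrase(_:inPageTexts:)` are hypothetical names, and this isn’t how SearchViewController is implemented; it only shows the idea of a case-insensitive phrase search across pages.

```swift
import Foundation

/// One occurrence of a phrase on a page. Hypothetical type for illustration.
struct PhraseMatch: Equatable {
    let pageIndex: Int
    let range: NSRange
}

/// Finds all case-insensitive occurrences of `phrase` in per-page text.
func findPhrase(_ phrase: String, inPageTexts pageTexts: [String]) -> [PhraseMatch] {
    guard !phrase.isEmpty else { return [] }
    var matches: [PhraseMatch] = []
    for (pageIndex, text) in pageTexts.enumerated() {
        var searchRange = text.startIndex..<text.endIndex
        // Keep searching after each hit until the page has no more matches.
        while let found = text.range(of: phrase, options: [.caseInsensitive], range: searchRange) {
            matches.append(PhraseMatch(pageIndex: pageIndex, range: NSRange(found, in: text)))
            searchRange = found.upperBound..<text.endIndex
        }
    }
    return matches
}
```

For most apps the built-in search UI is the simpler choice; a helper like this is only useful when results are needed without presenting any UI.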
Conclusion
OCR combined with text-related PDF functionality enables working with documents — on iOS using Swift — where text was previously inaccessible.
OCR is available across our product line on various platforms. See our OCR product page for the product that fits your needs.
FAQ
What is OCR, and why use it with PDFs?
Optical character recognition (OCR) converts images of text into computer-readable text. For PDFs containing scanned pages or vector graphics, OCR makes the text interactive, enabling selection, copying, highlighting, and searching.
Which languages does Nutrient OCR support?
Nutrient OCR supports multiple languages through language bundles. Add the appropriate bundle for each language your app needs to recognize. See the OCR integration guide for the full list.
Does performing OCR modify the original document?
No. The Processor API creates a new PDF with the recognized text layer. The original document remains unchanged. You specify the output URL where the processed document is saved.
Can I perform OCR on specific pages only?
Yes. The performOCROnPages(at:options:) method accepts an IndexSet specifying which pages to process. This allows you to OCR only the pages that need it.