How to unlock PDF functionality with OCR in iOS

    TL;DR
    • Use OCR to make scanned PDFs interactive and their text computer-readable
    • Extract words, select text, highlight passages, and search documents
    • Integrate the OCR framework and add language bundles for recognition
    • Process documents with the Processor API and display results in PDFViewController

    With Nutrient iOS SDK 9.5, we introduced optical character recognition (OCR) for PDFs. OCR makes inaccessible text in a PDF — whether scanned or consisting of vector graphics — interactive and computer-readable.

    The following examples demonstrate how to add text-related functionality to documents with previously inaccessible text by combining OCR and PDF features.

    Use cases for OCR

    OCR exposes previously inaccessible text in scanned or photographed images converted to PDF. Access to the underlying text enables word extraction, text selection, passage highlighting, and phrase searching.

    Integrating OCR functionality

    To use OCR in an app, integrate the Nutrient OCR framework and add the appropriate language bundles. For complete instructions, see our OCR integration guide.

    Performing OCR

    Once the OCR framework is integrated, use the API to perform OCR on a document. The following code performs OCR on the first page, detects English text, saves the result to a new location, and displays it:

    guard let processorConfiguration = Processor.Configuration(document: document) else { return }
    processorConfiguration.performOCROnPages(at: IndexSet(integer: 0), options: ProcessorOCROptions(language: .english))
    let processor = Processor(configuration: processorConfiguration, securityOptions: nil)
    let ocrURL: URL = ... // File URL for OCRed document to be saved at.
    DispatchQueue.global(qos: .userInitiated).async {
        do {
            try processor.write(toFileURL: ocrURL)
        } catch {
            // Handle error.
            return
        }
        DispatchQueue.main.async {
            let ocrDocument = Document(url: ocrURL)
            pdfController.document = ocrDocument
        }
    }

    This creates a processor configuration from the document that should have OCR performed on it. We then call performOCROnPages(at:options:) to mark pages for OCR, passing an index set of the pages to process and the language the text should be recognized in.

    Next, we create the processor with the configuration and choose a URL where the output PDF should be saved. We then call write(toFileURL:) on a background queue, since saving the document to disk can take a few seconds and we don’t want to block the main thread.
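    The output URL is left open in the snippet above. One possible way to produce one, as a minimal sketch that assumes the temporary directory is an acceptable location, is shown here. The helper name makeOCROutputURL and the "ocr-output" base name are hypothetical and not part of the Nutrient SDK.

```swift
import Foundation

// Hypothetical helper (not part of the Nutrient SDK): builds a unique
// file URL in the app's temporary directory for the OCR output.
func makeOCROutputURL(baseName: String = "ocr-output") -> URL {
    // A UUID in the file name avoids overwriting earlier results.
    let fileName = "\(baseName)-\(UUID().uuidString).pdf"
    return FileManager.default.temporaryDirectory
        .appendingPathComponent(fileName)
}

let ocrURL = makeOCROutputURL()
```

    The system may purge files in the temporary directory, so a location such as the app's Documents or Application Support directory is likely a better fit when the processed PDF needs to persist.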

    Once the PDF has been written, we create a document from the URL of the new file and display it in PDFViewController on the main queue.

    For more details and examples, see our OCR usage guide.

    Working with text on the processed document

    After OCR processing, the document supports text-related functionality.

    Extraction

    Text can be extracted from a document after performing OCR. There are various APIs for getting a word from a document’s pages, the most important of which is the text parser.

    The text parser exposes a page’s text in forms that are easy to work with, including a list of all the words on the page. Here’s an example of how to extract the first word of the first page:

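    // Both values are optional, so unwrap them safely instead of force unwrapping.
    guard let textParser = document.textParserForPage(at: 0),
          let word = textParser.words.first else { return }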

    We use the document’s text parser to get the textual representation of the first page, and then read the first element of its words property to get the first word on the page.

    Selection

    Text selection works through Nutrient’s UI or programmatically. All recognized text can be selected. To select the first word of the first page:

    guard let textParser = document.textParserForPage(at: 0),
          let word = textParser.words.first,
          let pageView = pdfController.pageViewForPage(at: 0) else { return }
    // Select the glyphs that make up the first word.
    let selectionView = pageView.selectionView
    selectionView.selectedGlyphs = textParser.glyphs(in: word.range)

    We extract the first word from the first page’s text, and then set the selectedGlyphs property of the page view’s text selection view to mark that word as selected.

    Highlight

    Beyond selecting text in the processed document, we can also add highlights. Highlights are text markup annotations that add a background to the marked text.

    The code below shows how to highlight a selected phrase:

    // The factory method returns an optional, so avoid force unwrapping it.
    guard let highlightAnnotation = HighlightAnnotation.textOverlayAnnotation(with: selectionView.selectedGlyphs) else { return }
    document.add(annotations: [highlightAnnotation])

    This will take the currently selected text, create a highlight annotation with the default style, and add it to the document.

    Search

    SearchViewController lets users of an app search all text in a document. By default, the search UI is accessed via the search button item shown in the navigation bar of the PDF controller. Search also works with text recognized through OCR.

    Conclusion

    OCR combined with text-related PDF functionality enables working with documents — on iOS using Swift — where text was previously inaccessible.

    OCR is available across our product line on various platforms. See our OCR product page for the product that fits your needs.

    FAQ

    What is OCR and why use it with PDFs?

    Optical character recognition (OCR) converts images of text into computer-readable text. For PDFs containing scanned pages or vector graphics, OCR makes the text interactive — enabling selection, copying, highlighting, and searching.

    Which languages does Nutrient OCR support?

    Nutrient OCR supports multiple languages through language bundles. Add the appropriate bundle for each language your app needs to recognize. See the OCR integration guide for the full list.

    Does OCR modify the original PDF?

    No. The Processor API creates a new PDF with the recognized text layer. The original document remains unchanged. You specify the output URL where the processed document is saved.

    Can I perform OCR on specific pages only?

    Yes. The performOCROnPages(at:options:) method accepts an IndexSet specifying which pages to process. This allows you to OCR only the pages that need it.
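
    As a rough sketch of building such an index set with Foundation alone, the following combines a page range with a single extra page and then drops indexes beyond the page count. The names pagesToOCR, pageCount, and validPages are illustrative, and the page count of 5 is an assumed value rather than something read from a real document.

```swift
import Foundation

// Page indexes are zero-based, matching IndexSet(integer: 0) for the first page.
var pagesToOCR = IndexSet(integersIn: 0..<3) // Pages 1 through 3.
pagesToOCR.insert(7)                          // Plus page 8.

// Hypothetical safeguard: drop indexes beyond the document's page count.
let pageCount = 5 // Assumed value for illustration.
let validPages = pagesToOCR.filteredIndexSet { $0 < pageCount }
```

    The resulting set can then be passed to performOCROnPages(at:options:) in place of IndexSet(integer: 0).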

    Stefan Kieleithner


    iOS Senior Software Engineer

    Stefan began his journey into iOS development in 2013 and has been passionate about it ever since. In his free time, he enjoys playing board and video games, spending time with his cats, and gardening on his balcony.
