Extract Images from PDFs on iOS
This guide shows how to programmatically extract bitmap images from a PDF page.
Embedded images in a PDF page are represented by the ImageInfo class and can be retrieved via the images property on TextParser. ImageInfo also provides an assortment of image metadata properties, as well as methods to extract an image from a PDF as a UIImage. To obtain the text parser for a given page, use the Document.textParserForPage(at:) API.
The code below grabs all the bitmap images from the first page of the given PDF document and makes them available for further processing as UIImage instances:
// Update to use your document name and location.
let fileURL = Bundle.main.url(forResource: "Document", withExtension: "pdf")!
let document = Document(url: fileURL)

// Page indices are zero-based, so 0 is the first page.
guard let parser = document.textParserForPage(at: 0) else {
    print("Parsing failed.")
    return
}

// The embedded images of the page, one ImageInfo per image.
let imageInfos = parser.images

let images: [UIImage] = imageInfos.compactMap { imageInfo in
    do {
        // Some PDF images are in the CMYK color space, which isn't a supported encoding.
        // Using this call converts all images to the RGB color space.
        return try imageInfo.imageInRGBColorSpace()
    } catch {
        print("Image processing failed. Error: \(error.localizedDescription)")
        return nil
    }
}

// Do something with the images...
print("Found \(images.count) images.")
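As one possible way to process the extracted images further, the sketch below writes each UIImage to disk as a PNG file. It relies only on UIKit and Foundation. The helper name savePNGs(_:to:) and the image-N.png naming scheme are hypothetical choices made for this illustration and are not part of the SDK.

```swift
import UIKit

/// Hypothetical helper (not part of the SDK): writes each image as a PNG file
/// into `directory` and returns the URLs of the files that were written.
func savePNGs(_ images: [UIImage], to directory: URL) throws -> [URL] {
    // Create the target directory if it doesn't exist yet.
    try FileManager.default.createDirectory(at: directory,
                                            withIntermediateDirectories: true)

    var writtenURLs: [URL] = []
    for (index, image) in images.enumerated() {
        // pngData() returns nil if the image has no data that can be encoded,
        // so such images are skipped rather than treated as an error.
        guard let data = image.pngData() else { continue }

        let url = directory.appendingPathComponent("image-\(index).png")
        try data.write(to: url, options: .atomic)
        writtenURLs.append(url)
    }
    return writtenURLs
}
```

With the images array from the code above, a call could look like try savePNGs(images, to: FileManager.default.temporaryDirectory.appendingPathComponent("ExtractedImages")). PNG is used here because it is lossless. If file size matters more than fidelity, jpegData(compressionQuality:) is an alternative.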
The TextParser API also offers access to the page text. To learn more about it, refer to the Parsing and Text Extraction guides.