Extract Content from PDF File

Extract Text from PDF File

This step extracts all the text in a PDF file. Kingfisher detects image-based PDF pages and runs OCR on them before extracting any text. The only files it cannot extract meaningful text from by default are those that use font encoding; for these files, we advise users to switch to OCR.
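The per-page decision described above can be sketched in Python as follows. This is only an illustration, not Kingfisher's implementation: `extract_page_text`, `read_text_layer`, and `run_ocr` are hypothetical names, and treating a page with an empty text layer as an image page is an assumption.

```python
from typing import Callable, List


def extract_page_text(
    pages: List[object],
    read_text_layer: Callable[[object], str],
    run_ocr: Callable[[object], str],
) -> List[str]:
    """Return one text string per page, using OCR only where needed.

    Assumption: a page whose text layer is empty (or whitespace only)
    is an image page. Kingfisher's real detection may differ.
    """
    results = []
    for page in pages:
        text = read_text_layer(page)
        if not text.strip():
            # No usable text layer: fall back to OCR (hypothetical callable).
            text = run_ocr(page)
        results.append(text)
    return results
```

Passing the two readers in as callables keeps the sketch independent of any particular PDF or OCR library.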

Screen Field / Button    Description
Start Page               Page number of the page you want Kingfisher to start extracting text from.
End Page                 Page number of the page you want Kingfisher to stop extracting text from.
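As a rough illustration of how the Start Page and End Page fields bound the extraction, the Python sketch below selects a page range from a list of per-page text. The function name `select_pages` is hypothetical, and the sketch assumes that page numbers are 1-based and that the End Page itself is included; the actual Kingfisher behaviour may differ.

```python
from typing import List


def select_pages(page_texts: List[str], start_page: int, end_page: int) -> str:
    """Join the text of pages start_page..end_page.

    Assumptions: pages are numbered from 1 and end_page is inclusive.
    """
    if start_page < 1 or end_page < start_page:
        raise ValueError("expected 1 <= start_page <= end_page")
    # Convert the 1-based start page to a 0-based slice index.
    # Slicing silently stops at the last page if end_page is too large.
    return "\n".join(page_texts[start_page - 1:end_page])
```

For example, `select_pages(["a", "b", "c", "d"], 2, 3)` returns the text of pages 2 and 3 only.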

PDF to CSV/XLSX

This step extracts tabular data from PDF files. See section 5 for more details.

Advanced Export to CSV/XLSX

This step extracts text that appears before or after certain expressions. See section 5 for more details.
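The idea of capturing text before or after an expression can be sketched in Python with the standard `re` module. This is not Kingfisher's implementation: the helper names `text_after` and `text_before` are hypothetical, and the sketch assumes that the expressions are regular expressions and that only text on the same line as the match is wanted.

```python
import re
from typing import Optional


def text_after(text: str, expression: str) -> Optional[str]:
    """Return the rest of the line after the first match of expression."""
    # (?:...) keeps alternations inside the expression grouped together.
    pattern = r"(?:" + expression + r")[ \t]*(?P<value>.*)"
    match = re.search(pattern, text)
    return match.group("value").strip() if match else None


def text_before(text: str, expression: str) -> Optional[str]:
    """Return the part of the line before the first match of expression."""
    pattern = r"(?P<value>.*?)[ \t]*(?:" + expression + r")"
    match = re.search(pattern, text)
    return match.group("value").strip() if match else None


sample = "Invoice No: 12345\nTotal: 99.50 USD"
invoice_number = text_after(sample, r"Invoice No:")
total_amount = text_before(sample, r"USD")
```

Both helpers return `None` when the expression does not occur, so a caller can tell a missing field apart from an empty one.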