Extract pages from PDFs on Android
PdfProcessor can export pages from one document into another document. You can extract a single page, a range of pages, or multiple page ranges:
// Page numbers start at 0. This range contains the fifth page of the document.
val task = PdfProcessorTask.fromDocument(document).keepPages(setOf(4))

// Keep pages 5, 6, and 7.
val task = PdfProcessorTask.fromDocument(document).keepPages(setOf(4, 5, 6))

// Remove the first page.
val task = PdfProcessorTask.fromDocument(document).removePages(setOf(0))
// Page numbers start at 0. This range contains the fifth page of the document.
PdfProcessorTask task = PdfProcessorTask.fromDocument(document).keepPages(new HashSet<Integer>(Arrays.asList(4)));

// Keep pages 5, 6, and 7.
PdfProcessorTask task = PdfProcessorTask.fromDocument(document).keepPages(new HashSet<Integer>(Arrays.asList(4, 5, 6)));

// Remove the first page.
PdfProcessorTask task = PdfProcessorTask.fromDocument(document).removePages(new HashSet<Integer>(Arrays.asList(0)));
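Because page numbers start at 0, it is easy to be off by one when translating the page numbers a reader sees into the set passed to keepPages or removePages. The following is a minimal sketch of one way to build such a set from multiple page ranges. The PageRanges class and its toZeroBasedIndexes method are hypothetical helpers written for illustration; they are not part of the PSPDFKit API and use only the Java standard library.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for illustration only; not part of PSPDFKit.
final class PageRanges {

    /**
     * Converts 1-based, inclusive page ranges into a sorted set of 0-based page indexes.
     * Each range is a two-element array: {firstPage, lastPage}.
     */
    static Set<Integer> toZeroBasedIndexes(int[]... ranges) {
        Set<Integer> indexes = new TreeSet<>();
        for (int[] range : ranges) {
            if (range.length != 2 || range[0] < 1 || range[1] < range[0]) {
                throw new IllegalArgumentException(
                        "Each range must be {first, last} with 1 <= first <= last");
            }
            // Shift every page number down by one to get its 0-based index.
            for (int page = range[0]; page <= range[1]; page++) {
                indexes.add(page - 1);
            }
        }
        return indexes;
    }

    public static void main(String[] args) {
        // Pages 5 to 7 and page 10, numbered the way a reader would count them.
        Set<Integer> indexes = toZeroBasedIndexes(new int[]{5, 7}, new int[]{10, 10});
        System.out.println(indexes); // prints [4, 5, 6, 9]
    }
}
```

The resulting set has the same shape as the sets passed to keepPages and removePages in the examples above, so it could presumably be handed to either method directly.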
After creating a PdfProcessorTask, you can start extracting the pages by calling either the asynchronous PdfProcessor#processDocumentAsync method or the blocking PdfProcessor#processDocument method. By default, all annotations are preserved. You can queue multiple operations on a document by calling multiple methods on a PdfProcessorTask object before starting processing. The operations are executed in the same order as your method calls:
val outputFile = File(getFilesDir(), "extracted-pages.pdf")

// Keep pages 5, 6, and 7.
val task = PdfProcessorTask.fromDocument(document).keepPages(setOf(4, 5, 6))

PdfProcessor.processDocumentAsync(task, outputFile)
    // Run processing on the background thread.
    .subscribeOn(Schedulers.io())
    // Publish results on the main thread so we can update the UI.
    .observeOn(AndroidSchedulers.mainThread())
    .subscribe(
        { progress: PdfProcessor.ProcessorProgress ->
            Toast.makeText(context, "Processing page ${progress.pagesProcessed}/${progress.totalPages}", Toast.LENGTH_SHORT).show()
        },
        { error: Throwable ->
            Toast.makeText(context, "Processing has failed: ${error.message}", Toast.LENGTH_SHORT).show()
        },
        { Toast.makeText(context, "Processing has been completed successfully.", Toast.LENGTH_SHORT).show() }
    )
final File outputFile = new File(getFilesDir(), "extracted-pages.pdf");

// Keep pages 5, 6, and 7.
PdfProcessorTask task = PdfProcessorTask.fromDocument(document).keepPages(new HashSet<Integer>(Arrays.asList(4, 5, 6)));

PdfProcessor.processDocumentAsync(task, outputFile)
    // Run processing on the background thread.
    .subscribeOn(Schedulers.io())
    // Publish results on the main thread so we can update the UI.
    .observeOn(AndroidSchedulers.mainThread())
    .subscribe(new DefaultSubscriber<PdfProcessor.ProcessorProgress>() {
        @Override
        public void onComplete() {
            Toast.makeText(context, "Processing has been completed successfully.", Toast.LENGTH_SHORT).show();
        }

        @Override
        public void onError(Throwable e) {
            Toast.makeText(context, "Processing has failed: " + e.getMessage(), Toast.LENGTH_SHORT).show();
        }

        @Override
        public void onNext(PdfProcessor.ProcessorProgress processorProgress) {
            Toast.makeText(context, "Processing page " + processorProgress.getPagesProcessed() + "/" + processorProgress.getTotalPages(), Toast.LENGTH_SHORT).show();
        }
    });
💡 Tip: You can use page extraction to merge pages from two or more documents. All you need to do is load a compound PdfDocument, for example by using PSPDFKit#openDocuments or any of the PdfActivity#showDocuments methods. Have a look at DocumentProcessingExample inside the Catalog app for a demo of this.