Indexing PDF documents on Android
Nutrient supports fast and efficient full-text search in PDF documents through PdfLibrary. This document describes how to get started with PdfLibrary.
Getting started
Using PdfLibrary is relatively straightforward. You begin by indexing documents:
Kotlin:

// Assume that you have two valid `PdfDocument`s.
val doc1: PdfDocument = ...
val doc2: PdfDocument = ...

// The library will be saved in your application's files directory.
val library = PdfLibrary.get(File(context.filesDir, "library.db").absolutePath)
library.enqueueDocuments(listOf(doc1, doc2))

Java:

// Assume that you have two valid `PdfDocument`s.
PdfDocument doc1, doc2;

// The library will be saved in your application's files directory.
PdfLibrary library = PdfLibrary.get(new File(context.getFilesDir(), "library.db").getAbsolutePath());
List<PdfDocument> documentList = new ArrayList<>();
documentList.add(doc1);
documentList.add(doc2);
library.enqueueDocuments(documentList);

PdfLibrary allows you to query for the current indexing state. To query the library only after all documents have been indexed, check isIndexing() first. You can also check the current status of an individual document by using getIndexStatusForUID().
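If a search should only run once indexing has finished, one option is to poll the indexing state from a background thread. The following is a minimal sketch, not part of the SDK: IndexingGate and awaitIdle are hypothetical names, and the BooleanSupplier stands in for a call such as library::isIndexing, under the assumption that isIndexing() returns a boolean.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

/** Hypothetical helper (not part of the SDK) that waits for indexing to finish. */
final class IndexingGate {
    private IndexingGate() {}

    /**
     * Polls {@code isIndexing} until it reports false or the timeout elapses.
     *
     * @return true if indexing finished in time; false on timeout or interruption.
     */
    static boolean awaitIdle(BooleanSupplier isIndexing, long timeoutMillis, long pollMillis) {
        final long start = System.nanoTime();
        final long timeoutNanos = TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (isIndexing.getAsBoolean()) {
            if (System.nanoTime() - start >= timeoutNanos) {
                return false; // Still indexing when the timeout was reached.
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                // Preserve the interrupt flag for the caller and stop waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }
}
```

Under those assumptions, a call might look like IndexingGate.awaitIdle(library::isIndexing, 30_000, 250). Because the helper sleeps between polls, it should not be called on the main thread.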
Search results are delivered to the onSearchCompleted callback of the QueryResultListener that you pass to search(). The results themselves are delivered as a Map that maps each document’s UID String to the set of page numbers containing a match.
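To make the result shape concrete, here is a small sketch that turns such a map into a summary with documents ordered by UID and pages in ascending order. SearchResultSummary and its methods are hypothetical helper names, not SDK APIs; the only thing taken from the text above is that results arrive as a Map<String, Set<Integer>> keyed by document UID.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Hypothetical helper (not part of the SDK) for post-processing search results. */
final class SearchResultSummary {
    private SearchResultSummary() {}

    /** Returns a UID-ordered map whose values are the matching pages in ascending order. */
    static Map<String, List<Integer>> sortedPagesByUid(Map<String, Set<Integer>> results) {
        Map<String, List<Integer>> sorted = new TreeMap<>();
        for (Map.Entry<String, Set<Integer>> entry : results.entrySet()) {
            List<Integer> pages = new ArrayList<>(entry.getValue());
            Collections.sort(pages);
            sorted.put(entry.getKey(), pages);
        }
        return sorted;
    }

    /** Counts the matching pages across all documents. */
    static int totalMatchingPages(Map<String, Set<Integer>> results) {
        int total = 0;
        for (Set<Integer> pages : results.values()) {
            total += pages.size();
        }
        return total;
    }
}
```

A helper like this could be called from inside onSearchCompleted with the map it receives, for example to build a stable, sorted list for display.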
To show preview snippets, enable the generateTextPreviews() query option. The preview text snippets are then delivered to the onSearchPreviewsGenerated method of QueryResultListener as a Map that maps each document’s UID String to a set of QueryPreviewResult objects.
Example:
Kotlin:

// Set up search result options.
val options = QueryOptions.Builder()
    .generateTextPreviews(true)
    .previewRange(20, 120)
    .build()

// Run the search. The search will run on a background thread and the callbacks will be called
// from the background thread as well.
library.search("looking for this text", options, object : QueryResultListener {
    override fun onSearchCompleted(searchString: String, results: Map<String, Set<Int>>) {
        // Results contain UID → set of pages mapping.
    }

    override fun onSearchPreviewsGenerated(searchString: String, previews: Map<String, Set<QueryPreviewResult>>) {
        // Previews contain UID → set of `QueryPreviewResult` mappings.
    }
})

Java:

// Set up search result options.
final QueryOptions options = new QueryOptions.Builder()
    .generateTextPreviews(true)
    .previewRange(20, 120)
    .build();

// Run the search. The search will run on a background thread and the callbacks will be called
// from the background thread as well.
library.search("looking for this text", options, new QueryResultListener() {
    @Override
    public void onSearchCompleted(@NonNull String searchString, @NonNull Map<String, Set<Integer>> results) {
        // Results contain UID → set of pages mapping.
    }

    @Override
    public void onSearchPreviewsGenerated(@NonNull String searchString, @NonNull Map<String, Set<QueryPreviewResult>> previews) {
        // Previews contain UID → set of `QueryPreviewResult` mappings.
    }
});
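
Because the callbacks arrive on a background thread, anything that touches the UI needs to be handed back to the main thread. One possible approach, sketched here with plain java.util.concurrent types, is to wrap the handling logic so that it runs on a given Executor. CallbackDispatch is a hypothetical helper and not part of the SDK.

```java
import java.util.concurrent.Executor;
import java.util.function.BiConsumer;

/** Hypothetical helper (not part of the SDK) for re-dispatching callback payloads. */
final class CallbackDispatch {
    private CallbackDispatch() {}

    /** Wraps {@code target} so that every invocation is executed on {@code executor}. */
    static <A, B> BiConsumer<A, B> on(Executor executor, BiConsumer<A, B> target) {
        return (a, b) -> executor.execute(() -> target.accept(a, b));
    }
}
```

On Android, the executor could be the one returned by ContextCompat.getMainExecutor(context). The wrapped consumer would then be invoked from inside onSearchCompleted with the search string and the results map.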