Extract Selected Text from PDFs Programmatically
PSPDFKit for Web comes with a reliable, cross-browser API to access text selection and retrieve both selected text and text lines located within a given selection range.
Each PSPDFKit instance can have only one selection at a time. The current text selection for a document is represented by a PSPDFKit.TextSelection object.
Getting Text Selection Programmatically
Text selection for the current PDF document can be retrieved at any time using the Instance#getTextSelection method, which returns an instance of TextSelection:
    const textSelection = instance.getTextSelection();
    console.log(textSelection instanceof PSPDFKit.TextSelection); // > true
Since TextSelection is an immutable object, its value won’t change over time, for example when the current text is deselected or some other text is selected. To get the updated text selection, Instance#getTextSelection must be invoked again.
Once a TextSelection exists, the selected text can be retrieved with getText. This method is asynchronous, so it returns a Promise:
    // Using async/await (inside an async function or a module)
    const textSelection = instance.getTextSelection();
    const text = await textSelection.getText();
    console.log(text);

    // Using promise chaining
    const textSelection = instance.getTextSelection();
    textSelection.getText().then(function (text) {
      console.log(text);
    });
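Because a TextSelection never updates after it is created, it can help to wrap the lookup in a small helper that asks the instance for a fresh selection on every call. The following is a minimal sketch: readSelectedText is a hypothetical name, and the code assumes getTextSelection yields a falsy value (such as null) when nothing is selected.

```javascript
// Hypothetical helper: fetch a fresh TextSelection on every call, because
// earlier TextSelection objects are immutable snapshots that never update.
// Assumption: getTextSelection() returns a falsy value when nothing is selected.
async function readSelectedText(instance) {
  const textSelection = instance.getTextSelection();
  if (!textSelection) {
    // Nothing selected: report an empty string instead of throwing.
    return "";
  }
  // getText() returns a Promise that resolves to the selected text.
  return textSelection.getText();
}
```

A call such as readSelectedText(instance).then(console.log) could then be made whenever the latest selection is needed, for example from a toolbar button handler.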
TextSelection objects expose two other methods:
- getSelectedTextLines, which returns the TextLines rendered within a given selection range.
- getBoundingClientRect, which returns the bounding box of the current text selection in client coordinates.
Other useful properties, like the selection’s startNode and endNode, are available on TextSelection objects. Please refer to the API docs for further details.
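The two methods listed above can be combined to build a small summary of a selection. This is only a sketch under stated assumptions: summarizeSelection is a hypothetical name, both methods are treated as possibly asynchronous, the text lines are assumed to arrive as an iterable collection, and each TextLine is assumed to carry its text in a contents property. Check the API docs before relying on any of these details.

```javascript
// Hypothetical helper: collect the text lines and bounding box of a selection.
// Assumptions (verify against the API docs):
//   - getSelectedTextLines() and getBoundingClientRect() may return promises
//   - the text lines come back as an iterable collection
//   - each TextLine exposes its text as `contents`
async function summarizeSelection(textSelection) {
  if (!textSelection) {
    return { lines: [], rect: null };
  }
  // Promise.all also accepts plain values, so this works either way.
  const [textLines, rect] = await Promise.all([
    textSelection.getSelectedTextLines(),
    textSelection.getBoundingClientRect(),
  ]);
  // Array.from accepts any iterable and maps each line to its text.
  const lines = Array.from(textLines, (line) => line.contents);
  return { lines, rect };
}
```

The returned rect is in client coordinates, so it could be used, for instance, to position a tooltip next to the selected text.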
Subscribing to Change Events
Text selection updates are dispatched every time text is selected or deselected. Subscribe to the textSelection.change event to be notified when the selection changes:
    // Using arrow functions
    instance.addEventListener("textSelection.change", textSelection => {
      if (textSelection) {
        textSelection.getText().then(text => {
          console.log(text);
        });
      } else {
        console.log("no text is selected");
      }
    });

    // Equivalent code using function expressions
    instance.addEventListener("textSelection.change", function(textSelection) {
      if (textSelection) {
        textSelection.getText().then(function(text) {
          console.log(text);
        });
      } else {
        console.log("no text is selected");
      }
    });
Notice that when the text is deselected, the textSelection argument is null.
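The subscription pattern above can be packaged so that callers receive plain text and can stop listening later. The sketch below uses a hypothetical helper name, watchSelectedText, and assumes the instance offers a removeEventListener method that mirrors addEventListener.

```javascript
// Hypothetical helper: forward the selected text to a callback on every
// change and return a function that stops listening.
// Assumption: instance.removeEventListener mirrors addEventListener.
function watchSelectedText(instance, onText) {
  const listener = (textSelection) => {
    if (!textSelection) {
      // The argument is null when the text is deselected.
      onText("");
      return;
    }
    textSelection.getText().then(onText);
  };
  instance.addEventListener("textSelection.change", listener);
  // Keep the same listener reference so it can be removed later.
  return () => instance.removeEventListener("textSelection.change", listener);
}
```

Keeping the returned function around and calling it during cleanup avoids leaving stale listeners attached. Note that getText is asynchronous, so with rapid selection changes the callbacks may not arrive in the order the selections were made.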