Java PDF Editor: How to Programmatically Edit PDFs Using Java

In this post, you’ll learn how to use the PSPDFKit Java PDF editor library to manipulate the contents of PDF files. More specifically, you’ll learn how to create complex workflows to programmatically edit PDFs in Java, manipulate PDF files, and add or extract pages from PDF files using PSPDFKit.

This article will provide a detailed introduction to editing PDF files in Java and cover the most popular use cases, including how to:

  • Merge and combine PDFs

  • Rotate pages

  • Duplicate PDF pages

  • Move or rearrange pages

  • Remove pages from a PDF

  • Extract a page from a PDF

  • Add a PDF page

  • Split a PDF

  • Change the page label of a PDF

  • Import or export edits into a PDF

Additionally, you’ll learn how to create a new Java project using Maven in IntelliJ IDEA, as well as how to initialize the trial license of PSPDFKit.

Requirements to Get Started with the PSPDFKit Java PDF Editor

To get started, you’ll need the following:

Additionally, you can check out our guide on setting up Gradle/Maven dependencies for a Java project if you’re having any issues with them.

Creating a New Java Project Using Maven

First, you’ll create a new Java project in IntelliJ using the Maven quickstart archetype. To do this, launch IntelliJ IDEA and create a new project. Select Maven in the New Project dialog, and select the Create from archetype checkbox, as shown in the screenshot below.

Java PDF Editor Maven Quickstart

Click inside the list view and type quickstart. Select the archetype quickstart, as shown in the image below. Then click Next. Note that OpenJDK 18 is selected from the Project SDK dropdown above the list.

Java PDF Editor Maven Archetype

Enter a project name and path and click Next. On the last screen, select the group ID com.pspdf.

Click Finish.

Wait until Maven finishes loading the project dependencies. Next, you’ll add the PSPDFKit Maven repository and dependency to the pom.xml file, as described in the following section. Also update the values of the maven.compiler.source and maven.compiler.target properties to 16 in the POM file. Note that PSPDFKit requires Java language level 8 or higher.

Adding the PSPDFKit Repository and Dependency to a Maven Project

PSPDFKit is made available through a custom repository. Add it to the pom.xml file using the following code:

<repositories>
    <repository>
      <id>pspdfkit</id>
      <name>PSPDFKit Maven</name>
      <url>https://my.nutrient.io/maven/</url>
    </repository>
  </repositories>

After adding the repository, add the PSPDFKit dependency to your project using the XML below:

<dependency>
      <groupId>com.pspdfkit</groupId>
      <artifactId>libraries-java</artifactId>
      <version>1.4.1</version>
    </dependency>

The POM file will then appear as shown in the screenshot below. It can take a few minutes to resolve and download the library, based on your internet speed.

Maven Dependency

Here’s the content of the POM file:

<?xml version="1.0" encoding="UTF-8"?>

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
 xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
 <modelVersion>4.0.0</modelVersion>

 <groupId>com.pspdf</groupId>
 <artifactId>EditPDFInJava</artifactId>
 <version>1.0-SNAPSHOT</version>

 <name>EditPDFInJava</name>

 <url>https://pspdfkit.com/</url>

 <properties>
   <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
   <maven.compiler.source>16</maven.compiler.source>
   <maven.compiler.target>16</maven.compiler.target>
 </properties>

 <repositories>
   <repository>
     <id>pspdfkit</id>
     <name>PSPDFKit Maven</name>
     <url>https://my.nutrient.io/maven/</url>
   </repository>
 </repositories>

 <dependencies>
   <dependency>
     <groupId>junit</groupId>
     <artifactId>junit</artifactId>
     <version>4.11</version>
     <scope>test</scope>
   </dependency>
   <dependency>
     <groupId>com.pspdfkit</groupId>
     <artifactId>libraries-java</artifactId>
     <version>1.4.1</version>
   </dependency>
 </dependencies>

 <build>
   <pluginManagement>
     <plugins>

       <plugin>
         <artifactId>maven-clean-plugin</artifactId>
         <version>3.1.0</version>
       </plugin>

       <plugin>
         <artifactId>maven-resources-plugin</artifactId>
         <version>3.0.2</version>
       </plugin>
       <plugin>
         <artifactId>maven-compiler-plugin</artifactId>
         <version>3.8.0</version>
       </plugin>
       <plugin>
         <artifactId>maven-surefire-plugin</artifactId>
         <version>2.22.1</version>
       </plugin>
       <plugin>
         <artifactId>maven-jar-plugin</artifactId>
         <version>3.0.2</version>
       </plugin>
       <plugin>
         <artifactId>maven-install-plugin</artifactId>
         <version>2.5.2</version>
       </plugin>
       <plugin>
         <artifactId>maven-deploy-plugin</artifactId>
         <version>2.8.2</version>
       </plugin>
       <plugin>
         <artifactId>maven-site-plugin</artifactId>
         <version>3.7.1</version>
       </plugin>
       <plugin>
         <artifactId>maven-project-info-reports-plugin</artifactId>
         <version>3.0.0</version>
       </plugin>
     </plugins>
   </pluginManagement>
 </build>
</project>

Initializing the PSPDFKit License

It’s possible to use the trial license of PSPDFKit to check how well the API performs. It takes a single method call to initiate the PSPDFKit trial.

Open the IntelliJ Project Explorer and expand the folders underneath to open the AppTest.java file.

Java PDF Editor PSPDFKit License

This file contains a JUnit test, which you’ll use for demo purposes.

Add a @BeforeClass annotation and a public static method underneath it. When you run your demos, JUnit will call this method once before any test in the class. You’ll use it to initialize the free trial license. If you have a license for the PSPDFKit PDF editing library, you can use that instead:

@BeforeClass
public static void initialSetup() {
	PSPDFKit.initializeTrial();
}

If the IDE complains about missing or unknown methods, click on the line where the error appears and press Alt-Enter to fix the problem.

If you have a paid license of PSPDFKit, use the following (be sure to replace "licenseKey" with your license key):

PSPDFKit.initialize("licenseKey");

The PSPDFKit.initialize* methods can throw an exception of type PSPDFKitInitializeException. You can either handle it or declare it in the method signature; this post does the latter. The final form of the setup method is:

@BeforeClass
public static void initialSetup() throws PSPDFKitInitializeException {
  PSPDFKit.initializeTrial();
}

Go to the Run menu and select the Run AppTest menu option to ensure everything with the PSPDFKit library setup is correct. The hotkey Shift-F10 (Shift-Function-F10 on Mac) can also be used for this purpose.

Setting Up PDF File Resources for Practice

Now, you’ll add some PDF files useful for following along with this example. You can download these assets in ZIP format here. After downloading and unzipping the package, open it and go to the Catalog subfolder. Copy the Assets folder and open the IntelliJ project.

Now, right-click the module icon and paste the folder. A pop-up dialog will show up, asking you to input the project name and directory. Click OK.

Adding PDF Files

Setting the Java Language Level above 8

Click the Sources tab and select Java language level 8 or above, since it’s necessary for using the PSPDFKit API. For the purposes of this post, use 16.

Setting the language Java PDF Editor

Next, click Project in the menu on the left and make sure the language level is set to 16 there as well.

Java PDF Editor Java Level Above 8

Click Apply and then OK to close the dialog.

Merging PDFs Using Java

Next, you’ll use the PSPDFKit Java API to combine multiple PDF files.

In this scenario, you’ll start with a file named personal-letter.pdf, to which you’ll add a cover page from another file, coverPage.pdf.

If you open the first file in a PDF viewer, it looks like what’s shown below.

Merge PDF Java Example Doc

Start by removing the default @Test method, shouldAnswerWithTrue(). Then add a new test method to the class:

@Test
public void mergePDFs() {

}

To open the PDF files, you’ll need to create an instance of the java.io.File class and pass it to the PSPDFKit method, PdfDocument.open, through a new instance of FileDataProvider, as shown below:

FileDataProvider providerPersonalLetter = new FileDataProvider(
        new File("Assets/personal-letter.pdf"));
PdfDocument personalLetter = PdfDocument.open(providerPersonalLetter);

FileDataProvider providerCoverPage = new FileDataProvider(
        new File("Assets/coverPage.pdf"));
PdfDocument cover = PdfDocument.open(providerCoverPage);

IntelliJ will show errors underneath the PdfDocument and File classes. You can resolve these errors by clicking the error location and pressing Alt-Enter. IntelliJ will then automatically add the requisite import statements to the file.

Now you have an object, personalLetter, which represents the personal-letter.pdf file. You can add pages to this PDF document by retrieving an instance of the DocumentEditor class using the following method call:

DocumentEditor letterEditor = personalLetter.createDocumentEditor();

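Resolve the DocumentEditor type by clicking it and pressing Alt-Enter (Option-Return on Mac). Then insert the cover page ahead of the first page by calling importDocument on the editor with index 0, IndexPosition.BeforeIndex, and the cover page’s data provider, as in the complete code sample at the end of this section.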

The last step is to save your newly minted PDF file to disk. You’ll need an instance of the DocumentEditor class (which you already have), and an instance of the java.io.File class (which you’ll create):

try {
    File file = File.createTempFile("personal-letter-with-cover", ".pdf");
    System.out.println(file.getAbsolutePath());
    letterEditor.saveDocument(new FileDataProvider(file));
} catch (IOException ioex) {
    System.out.println("Unable to save file. " + ioex.toString());
}

The code that writes the PDF file to disk is wrapped in a try-catch block because it can throw an IOException.
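File.createTempFile does not write into the project folder. It creates an empty file in the system temporary directory and inserts a unique token between the prefix and the suffix, which is why the walkthrough prints the absolute path. The following is a minimal standalone sketch of that behavior using only the JDK; the class and method names are illustrative and not part of PSPDFKit:

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;

class TempPdfPath {
    // Creates an empty, uniquely named file in the default temporary directory.
    // The prefix must be at least three characters long.
    static File newTempPdf(String prefix) {
        try {
            return File.createTempFile(prefix, ".pdf");
        } catch (IOException e) {
            throw new UncheckedIOException("Unable to create temp file", e);
        }
    }

    public static void main(String[] args) {
        File file = newTempPdf("personal-letter-with-cover");
        // The printed name contains a generated token between prefix and suffix.
        System.out.println(file.getAbsolutePath());
        file.deleteOnExit();
    }
}
```

If you would rather have predictable output locations, you could construct the File from a fixed path instead; the temporary file approach simply avoids overwriting earlier results.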

Next, expand the Maven menu on the left and click the Lifecycle tab. This will show a dropdown menu with a list of goals. Select the clean goal and right-click on it to run it. This will remove all the files generated by the previous build.

Then, do the same with the test goal in the same dropdown menu. As an alternative, click the Run menu and select Run AppTest. The test will run, and the path of the merged PDF file will be shown in the Run log.

Merge PDF Java Test

Copy the path and open the file to see the result. The new combined PDF file will look like what’s shown below.

Merge PDF Java Final Result

In addition, the original PDF files will remain unchanged.

Here’s the complete code sample:

@Test
public void mergePDFs()
{
    FileDataProvider providerPersonalLetter = new FileDataProvider(
            new File("Assets/personal-letter.pdf"));
    PdfDocument personalLetter = PdfDocument.open(providerPersonalLetter);

    FileDataProvider providerCoverPage = new FileDataProvider(
            new File("Assets/coverPage.pdf"));
    PdfDocument cover = PdfDocument.open(providerCoverPage);

    DocumentEditor letterEditor = personalLetter.createDocumentEditor();
    letterEditor.importDocument(0, DocumentEditor.IndexPosition.BeforeIndex, providerCoverPage);

    try {
        File file = File.createTempFile("personal-letter-with-cover", ".pdf");
        System.out.println(file.getAbsolutePath());
        letterEditor.saveDocument(new FileDataProvider(file));
    } catch (IOException ioex) {
        System.out.println("Unable to save file. " + ioex.toString());
    }
}

Rotating PDF Pages Using Java

To rotate the pages of a PDF, you’ll use the FileDataProvider, PdfDocument, and DocumentEditor objects.

First, open a PDF file from the assets included in the project (coverPage.pdf). Create a new JUnit test method, rotatePDF, and add the code to open the source PDF. Then, create an instance of DocumentEditor to edit it in memory:

@Test
public void rotatePDF()
{
    FileDataProvider providerPersonalLetter = new FileDataProvider(
            new File("Assets/coverPage.pdf"));
    PdfDocument personalLetter = PdfDocument.open(providerPersonalLetter);
    DocumentEditor letterEditor = personalLetter.createDocumentEditor();
}

It’s possible to rotate individual pages of a PDF file with PSPDFKit using the following values:

  • 0 degrees

  • 90 degrees

  • 180 degrees

  • 270 degrees

The rotation degrees are represented by an enumeration called Rotation. The definition of this enumeration is given below:

public enum Rotation {
    Degrees0,
    Degrees90,
    Degrees180,
    Degrees270;

    private Rotation() {
    }
}
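Because only quarter turns are available, an angle that comes from user input or configuration has to be mapped onto one of the four constants before it can be used. The sketch below shows one way to do that mapping. It uses a local stand-in enum, QuarterTurn, that merely mirrors the constant names shown above; it is not the PSPDFKit Rotation type, and the helper name fromDegrees is hypothetical:

```java
class RotationMapping {
    // Local stand-in mirroring the four constant names; not the library type.
    enum QuarterTurn { Degrees0, Degrees90, Degrees180, Degrees270 }

    // Maps any multiple of 90 (including negative angles) to a quarter turn.
    static QuarterTurn fromDegrees(int degrees) {
        int normalized = Math.floorMod(degrees, 360);
        switch (normalized) {
            case 0:
                return QuarterTurn.Degrees0;
            case 90:
                return QuarterTurn.Degrees90;
            case 180:
                return QuarterTurn.Degrees180;
            case 270:
                return QuarterTurn.Degrees270;
            default:
                throw new IllegalArgumentException("Not a multiple of 90: " + degrees);
        }
    }

    public static void main(String[] args) {
        System.out.println(fromDegrees(-90));
        System.out.println(fromDegrees(450));
    }
}
```

Math.floorMod is used instead of the % operator so that negative angles normalize into the 0 to 359 range.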

The pages that need to be rotated are specified by an object of Set<Integer>. Since the example PDF file only has one page, define the set as follows:

Set<Integer> setOfPagesToRotate = new HashSet<>(Arrays.asList(0));

The page index starts from 0, just like array indexing. So, if you want to rotate the first page of a document, add the number 0 to your set.
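For longer documents, listing every index by hand gets tedious. As a rough sketch, a contiguous run of zero-based indices can be generated with the JDK stream API. The PageSets class and its range method are illustrative helpers, not part of PSPDFKit:

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class PageSets {
    // Zero-based indices from firstIndex (inclusive) to endIndex (exclusive).
    static Set<Integer> range(int firstIndex, int endIndex) {
        if (firstIndex < 0 || endIndex < firstIndex) {
            throw new IllegalArgumentException(
                    "Invalid range: " + firstIndex + ".." + endIndex);
        }
        return IntStream.range(firstIndex, endIndex)
                .boxed()
                .collect(Collectors.toCollection(TreeSet::new));
    }

    public static void main(String[] args) {
        // The first three pages of a document.
        System.out.println(range(0, 3));
    }
}
```

The resulting Set<Integer> has the same type as the hand-written set above, so it could be passed wherever this post passes a set of page indices.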

After this, call the rotatePages method:

letterEditor.rotatePages(setOfPagesToRotate, Rotation.Degrees90);

Now, save the file to disk. Because every example in this post ends with the same save step, move that logic into a private helper method:

private void savePdfFileToDisk(String fileName, DocumentEditor editor) {
    // Export the document to a file path.
    try {
        File file = File.createTempFile(fileName, ".pdf");
        System.out.println(file.getAbsolutePath());
        editor.saveDocument(new FileDataProvider(file));
    } catch (IOException e) {
        e.printStackTrace();
    }
}

Now, call the method:

savePdfFileToDisk("rotatedLetter", letterEditor);

Right-click anywhere inside your test method and select Run rotatePDF from the context menu. If everything goes according to plan, the test definition will have a green checkmark next to it.

Rotate PDF Page Java Test

The path of the file generated by PSPDFKit after rotating the pages is printed in the console. You can copy it and open the file to see the generated PDF.

You can see the before and after versions of the PDF below.

Rotate PDF Page Java Final Result

Here’s the complete code snippet used to rotate the PDF pages:

@Test
public void rotatePDF()
{
    FileDataProvider providerPersonalLetter = new FileDataProvider(
             new File("Assets/coverPage.pdf"));
    PdfDocument personalLetter = PdfDocument.open(providerPersonalLetter);

    DocumentEditor letterEditor = personalLetter.createDocumentEditor();

    Set<Integer> setOfPagesToRotate = new HashSet<Integer>(Arrays.asList(0));

    letterEditor.rotatePages(setOfPagesToRotate, Rotation.Degrees90);

    savePdfFileToDisk("rotatedLetter", letterEditor);
}

Duplicating PDF Pages Using Java

The duplication of PDF pages is achieved using DocumentEditor, and the process is more or less the same as the page rotation process in the section above. The only difference is that a different PSPDFKit API method is called.

First, open a PDF file from the assets included in the project (playground.pdf). Create a new JUnit test method called duplicatePDFPages. Then, add the code to open the source PDF, and create an instance of DocumentEditor to edit it:

@Test
public void duplicatePDFPages()
{
    FileDataProvider providerPDFFile = new FileDataProvider(
            new File("Assets/playground.pdf"));
    PdfDocument playGround = PdfDocument.open(providerPDFFile);

    DocumentEditor pdfEditor = playGround.createDocumentEditor();
}

Now, create a set of integers to represent the pages you want to duplicate:

Set<Integer> setOfPagesToDuplicate = new HashSet<>(Arrays.asList(0, 1, 2));

Finally, call the DocumentEditor.duplicatePages method:

pdfEditor.duplicatePages(setOfPagesToDuplicate);

And save to disk:

savePdfFileToDisk("fileWithDuplicatePages", pdfEditor);

Right-click anywhere inside the duplicatePDFPages method and click Run duplicatePDFPages. The test will run and pass, and the path of the newly generated file will be printed in the console. You can copy it and open it to see the results.

You can see the before and after versions of the PDF below.

Duplicate PDF Page Java Final Result

Here’s the complete code snippet used to duplicate PDF pages:

@Test
public void duplicatePDFPages()
{
    FileDataProvider providerPDFFile = new FileDataProvider(
            new File("Assets/playground.pdf"));
    PdfDocument playGround = PdfDocument.open(providerPDFFile);

    DocumentEditor pdfEditor = playGround.createDocumentEditor();

    Set<Integer> setOfPagesToDuplicate = new HashSet<Integer>(Arrays.asList(0, 1, 2));

    pdfEditor.duplicatePages(setOfPagesToDuplicate);

    savePdfFileToDisk("fileWithDuplicatePages", pdfEditor);
}

Moving or Rearranging PDF Pages in Java

It’s possible to change the sequence of pages inside a PDF file.

First, open a PDF file from the assets included in the project (playground.pdf). Create a new test called movePDFPages:

@Test
public void movePDFPages()
{
    FileDataProvider providerPDFFile = new FileDataProvider(
            new File("Assets/playground.pdf"));
    PdfDocument playGround = PdfDocument.open(providerPDFFile);

    DocumentEditor pdfEditor = playGround.createDocumentEditor();
}

Again, specify the indices of the pages you want to move using an object of Set<Integer>. For the purpose of this example, you’ll move the first page of the PDF document to the end. This means taking the page at index 0 and placing it after the last index, which is the total number of pages minus one.

Define the set first:

Set<Integer> setOfPagesToMove = new HashSet<>(Arrays.asList(0));

To find out the number of pages in a PDF file, call the PdfDocument.getPageCount method. Here, its result minus one is used as the second parameter of the PSPDFKit DocumentEditor.movePages method.

Make the call to move the PDF pages:

pdfEditor.movePages(setOfPagesToMove, Math.toIntExact(playGround.getPageCount() - 1), DocumentEditor.IndexPosition.AfterIndex);

The second parameter of the movePages method takes an int, but the getPageCount method call returns a long, which is why the value is converted with Math.toIntExact. The IndexPosition enumeration is used to specify whether you want to move a page of a PDF file before the target index or after the target index. It has two possible values:

  • BeforeIndex

  • AfterIndex
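Math.toIntExact differs from a plain (int) cast in that it throws an ArithmeticException when the long value does not fit into an int, instead of silently wrapping around. The sketch below illustrates this with a hard-coded page count standing in for the value returned by getPageCount; the MoveTargets class and lastIndex helper are hypothetical names:

```java
class MoveTargets {
    // Index of the last page for a given page count.
    static int lastIndex(long pageCount) {
        if (pageCount < 1) {
            throw new IllegalArgumentException("Document has no pages");
        }
        // Throws ArithmeticException if the result does not fit in an int.
        return Math.toIntExact(pageCount - 1);
    }

    public static void main(String[] args) {
        long pageCount = 10L; // stand-in for a real page count
        System.out.println(lastIndex(pageCount));

        try {
            lastIndex(Integer.MAX_VALUE + 2L);
        } catch (ArithmeticException e) {
            System.out.println("Too many pages for an int index");
        }
    }
}
```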

Now you can call the savePdfFileToDisk method:

savePdfFileToDisk("moveFirstPDFPageToLast", pdfEditor);

Run the test to see the results.

A side-by-side comparison of the original file and the file produced by PSPDFKit shows that the first page of the PDF has indeed been moved to the last location in the PDF file.

Move PDF Pages Java Final Result

Here’s the complete code snippet:

@Test
public void movePDFPages()
{
    FileDataProvider providerPDFFile = new FileDataProvider(
            new File("Assets/playground.pdf"));
    PdfDocument playGround = PdfDocument.open(providerPDFFile);
    DocumentEditor pdfEditor = playGround.createDocumentEditor();

    // Move the first page to the last.
    Set<Integer> setOfPagesToMove = new HashSet<Integer>(Arrays.asList(0));
    pdfEditor.movePages(setOfPagesToMove, Math.toIntExact(playGround.getPageCount() - 1), DocumentEditor.IndexPosition.AfterIndex);

    savePdfFileToDisk("moveFirstPDFPageToLast", pdfEditor);
}

Here’s the code for moving the last page to the first position:

// Move the last page to the first.
    setOfPagesToMove = new HashSet<Integer>(
            Arrays.asList(Math.toIntExact(playGround.getPageCount() - 1)));
    pdfEditor = playGround.createDocumentEditor();
    pdfEditor.movePages(setOfPagesToMove, 0, DocumentEditor.IndexPosition.BeforeIndex);
    savePdfFileToDisk("moveLastPDFPageToFirst", pdfEditor);

Because you added this code to your existing test method, you need to assign new values to setOfPagesToMove and pdfEditor.

Here’s how to move the first page of a PDF file to the middle of the file:

// Move the first page to the middle.
setOfPagesToMove = new HashSet<Integer>(Arrays.asList(0));
pdfEditor = playGround.createDocumentEditor();
pdfEditor.movePages(setOfPagesToMove,
        (Math.toIntExact(playGround.getPageCount()) / 2) - 1,
        DocumentEditor.IndexPosition.AfterIndex);
savePdfFileToDisk("movefirstPDFPageToMiddle", pdfEditor);
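One detail worth noting about the middle-index expression: with integer division, a one-page document yields (1 / 2) - 1, which is -1 and not a valid page index. The following sketch shows a guarded version of the same arithmetic. The helper name middleIndex is hypothetical, and clamping to 0 is just one possible policy for very short documents:

```java
class MiddleTarget {
    // Same arithmetic as the snippet above, clamped so it never goes below 0.
    static int middleIndex(long pageCount) {
        if (pageCount < 1) {
            throw new IllegalArgumentException("Document has no pages");
        }
        int unclamped = Math.toIntExact(pageCount) / 2 - 1;
        return Math.max(0, unclamped);
    }

    public static void main(String[] args) {
        System.out.println(middleIndex(10L)); // 10 / 2 - 1
        System.out.println(middleIndex(1L));  // clamped from -1
    }
}
```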

Here’s the complete code of the movePDFPages test method:

@Test
public void movePDFPages()
{
    FileDataProvider providerPDFFile = new FileDataProvider(
            new File("Assets/playground.pdf"));
    PdfDocument playGround = PdfDocument.open(providerPDFFile);
    DocumentEditor pdfEditor = playGround.createDocumentEditor();

        // Move the first page to the last.
    Set<Integer> setOfPagesToMove = new HashSet<Integer>(Arrays.asList(0));
    pdfEditor.movePages(setOfPagesToMove, Math.toIntExact(playGround.getPageCount() - 1), DocumentEditor.IndexPosition.AfterIndex);
    savePdfFileToDisk("moveFirstPDFPageToLast", pdfEditor);

    // Move the last page to the first.
    setOfPagesToMove = new HashSet<Integer>(
            Arrays.asList(Math.toIntExact(playGround.getPageCount() - 1)));
    pdfEditor = playGround.createDocumentEditor();
    pdfEditor.movePages(setOfPagesToMove, 0, DocumentEditor.IndexPosition.BeforeIndex);
    savePdfFileToDisk("moveLastPDFPageToFirst", pdfEditor);

    // Move the first page to the middle.
    setOfPagesToMove = new HashSet<Integer>(Arrays.asList(0));
    pdfEditor = playGround.createDocumentEditor();
    pdfEditor.movePages(setOfPagesToMove,
            (Math.toIntExact(playGround.getPageCount()) / 2) - 1,
            DocumentEditor.IndexPosition.AfterIndex);
    savePdfFileToDisk("movefirstPDFPageToMiddle", pdfEditor);
}

Extracting Pages from a PDF Using Java

Extracting a new PDF from an existing PDF can be accomplished using the removePages method. The process will be almost the same as duplicating PDF pages.

First, open a PDF file from the assets included in the project (playground.pdf). Create a new JUnit test method, extractPDFSubtractive, and add the code to open the PDF. Then, create an instance of DocumentEditor to edit it. This is a subtractive approach: you start from the full document and remove the pages you don’t want.

@Test
public void extractPDFSubtractive()
{
    FileDataProvider providerPDFFile = new FileDataProvider(
            new File("Assets/playground.pdf"));
    PdfDocument playGround = PdfDocument.open(providerPDFFile);

    DocumentEditor pdfEditor = playGround.createDocumentEditor();

    }

After this, create a Set<Integer> to specify the pages you want to remove from the PDF. The remaining pages will form your extracted PDF:

Set<Integer> setOfPagesToRemoveFromExtractedPDF = new HashSet<Integer>(Arrays.asList(1, 3, 5));

Because page indices are zero-based, this set removes pages 2, 4, and 6 from your PDF.
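The off-by-one between the page numbers a reader sees and the zero-based indices used in code is easy to get wrong. As a small sketch, a helper can do the conversion in one place. PageNumbers and toIndices are illustrative names, not PSPDFKit API:

```java
import java.util.Set;
import java.util.TreeSet;

class PageNumbers {
    // Converts 1-based page numbers into zero-based page indices.
    static Set<Integer> toIndices(int... pageNumbers) {
        Set<Integer> indices = new TreeSet<>();
        for (int pageNumber : pageNumbers) {
            if (pageNumber < 1) {
                throw new IllegalArgumentException(
                        "Page numbers start at 1: " + pageNumber);
            }
            indices.add(pageNumber - 1);
        }
        return indices;
    }

    public static void main(String[] args) {
        // Pages 2, 4, and 6 correspond to indices 1, 3, and 5.
        System.out.println(toIndices(2, 4, 6));
    }
}
```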

Pass this set to the DocumentEditor.removePages method and save the PDF to disk:

pdfEditor.removePages(setOfPagesToRemoveFromExtractedPDF);
savePdfFileToDisk("extractPDF", pdfEditor);

The image below shows the before and after versions of the file.

Extracting a PDF Page Java Final Result

Here’s the complete code snippet for removing pages from a PDF:

@Test
public void extractPDFSubtractive()
{
    FileDataProvider providerPDFFile = new FileDataProvider(
            new File("Assets/playground.pdf"));
    PdfDocument playGround = PdfDocument.open(providerPDFFile);

    DocumentEditor pdfEditor = playGround.createDocumentEditor();

    Set<Integer> setOfPagesToRemoveFromExtractedPDF = new HashSet<Integer>(Arrays.asList(1, 3, 5));

    pdfEditor.removePages(setOfPagesToRemoveFromExtractedPDF);

    savePdfFileToDisk("extractPDF", pdfEditor);
    }

Adding a Page to an Existing PDF File Using Java

It’s possible to add new pages to an existing PDF file using the DocumentEditor.addPage method.

First, open a PDF file from the assets included in the project (playground.pdf). Create a new test and add the PDF object’s initialization code to it:

@Test
public void addAPageToAPDF()
{
  FileDataProvider providerPDFFile = new FileDataProvider(
          new File("Assets/playground.pdf"));
  PdfDocument playGround = PdfDocument.open(providerPDFFile);
  DocumentEditor pdfEditor = playGround.createDocumentEditor();
}

Then, define the insets for use inside the PDF file:

Insets insets = new Insets(0, 0, 0, 0);

Call the addPage method on the DocumentEditor object and provide it with the following:

  • Page index where a new page will be added to the PDF

  • Information to add the page before or after the index specified

  • Height and width of the page

  • Rotation of the page in degrees

  • Color of the page

  • Inset to be used inside the page

The code looks like this:

pdfEditor.addPage(0, DocumentEditor.IndexPosition.BeforeIndex,
            200, 200, Rotation.Degrees90, Color.red, insets);

Afterward, save the file to disk:

savePdfFileToDisk("pageAddedToPDF", pdfEditor);

Here’s the full code:

@Test
public void addAPageToAPDF()
{
    FileDataProvider providerPDFFile = new FileDataProvider(
            new File("Assets/playground.pdf"));
    PdfDocument playGround = PdfDocument.open(providerPDFFile);
    DocumentEditor pdfEditor = playGround.createDocumentEditor();

    Insets insets = new Insets(0, 0, 0, 0);

    pdfEditor.addPage(0, DocumentEditor.IndexPosition.BeforeIndex,
            200, 200, Rotation.Degrees90, Color.red, insets);

     savePdfFileToDisk("pageAddedToPDF", pdfEditor);

    }

Note that you can add a new page anywhere inside an existing PDF file using the PSPDFKit API. It’s also possible to add content to the newly added page using the annotations functionality.

Splitting a PDF File Using Java

To split a PDF, create two DocumentEditor objects based on the same file. For the first object, specify which pages to keep, and for the second, specify the same pages to remove. In the end, you’ll save both files to the disk, and you’ll end up with two PDF files “split” from a single PDF.
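Keeping a set of pages in one editor and removing the same set in the other partitions the document: every page lands in exactly one of the two outputs. The plain-Java sketch below models that partition without any PSPDFKit calls. The SplitPlan class is hypothetical, and the page count of 6 is an assumed value chosen only for illustration:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

class SplitPlan {
    // Indices that end up in the second file: every page not in firstPart.
    static Set<Integer> remainder(int pageCount, Set<Integer> firstPart) {
        Set<Integer> rest = new TreeSet<>();
        for (int index = 0; index < pageCount; index++) {
            if (!firstPart.contains(index)) {
                rest.add(index);
            }
        }
        return rest;
    }

    public static void main(String[] args) {
        Set<Integer> firstPart = new HashSet<>(Arrays.asList(0, 1, 2, 3));
        int pageCount = 6; // assumed page count for illustration
        System.out.println("First file:  " + new TreeSet<>(firstPart));
        System.out.println("Second file: " + remainder(pageCount, firstPart));
    }
}
```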

First, open a PDF file from the assets included in the project (playground.pdf). Define the splitPDFFile method and the set of pages you want to use as the basis of the split operation:

@Test
public void splitPDFFile()
{
     FileDataProvider providerPDFFile = new FileDataProvider(
            new File("Assets/playground.pdf"));
    PdfDocument playGround = PdfDocument.open(providerPDFFile);
    DocumentEditor pdfEditorFirst = playGround.createDocumentEditor();
    DocumentEditor pdfEditorSecond = playGround.createDocumentEditor();
    Set<Integer> setOfPagesForSplit = new HashSet<Integer>(Arrays.asList(0, 1 , 2, 3));
}

Next, call the keepPages method on one DocumentEditor, and call removePages on the other. Then save the results in two different files:

pdfEditorFirst.keepPages(setOfPagesForSplit);
    pdfEditorSecond.removePages(setOfPagesForSplit);

    savePdfFileToDisk("bananaSplitPDFFirst", pdfEditorFirst);
    savePdfFileToDisk("bananaSplitPDFSecond", pdfEditorSecond);

Split PDF Java Final Result

The original PDF file is displayed in the bottom half of the screen. The first split PDF file is shown in the top-left corner, and the second split PDF file is shown in the top-right corner.

Here’s the full Java code snippet:

@Test
public void splitPDFFile()
{
     FileDataProvider providerPDFFile = new FileDataProvider(
             new File("Assets/playground.pdf"));
     PdfDocument playGround = PdfDocument.open(providerPDFFile);
     DocumentEditor pdfEditorFirst = playGround.createDocumentEditor();
     DocumentEditor pdfEditorSecond = playGround.createDocumentEditor();
     Set<Integer> setOfPagesForSplit = new HashSet<Integer>(Arrays.asList(0, 1 , 2, 3));

    pdfEditorFirst.keepPages(setOfPagesForSplit);
    pdfEditorSecond.removePages(setOfPagesForSplit);

    savePdfFileToDisk("bananaSplitPDFFirst", pdfEditorFirst);
    savePdfFileToDisk("bananaSplitPDFSecond", pdfEditorSecond);
}

Changing a PDF Page Label

PDFs have a unique feature called page labels, and they allow you to label specific pages of a PDF independent of bookmarks and file names. Labeling pages is a way to number or add arbitrary names to non-sequential pages.

By default, the page numbers are considered page labels. But it’s also possible to change the values of the page labels to something else. As an example, maybe the first few pages of a PDF file need to be labeled differently from the rest of the PDF.

To do this, first open a PDF file from the assets included in the project (playground.pdf). Then, define the changePageLabels test method:

@Test
public void changePageLabels() {
    FileDataProvider providerPDFFile = new FileDataProvider(
            new File("Assets/playground.pdf"));
     PdfDocument playGround = PdfDocument.open(providerPDFFile);
     DocumentEditor pdfEditor = playGround.createDocumentEditor();
}

Set the label value to the capital letter A for the first page of a PDF:

Set<Integer> setOfPagesForChangingLabel = new HashSet<Integer>(Arrays.asList(0));

    pdfEditor.setPageLabel(setOfPagesForChangingLabel, "A");

Use the following code to set the values B, C, and D for page numbers 2, 3, and 4 of the PDF:

setOfPagesForChangingLabel = new HashSet<Integer>(Arrays.asList(1));

    pdfEditor.setPageLabel(setOfPagesForChangingLabel, "B");

    setOfPagesForChangingLabel = new HashSet<Integer>(Arrays.asList(2));

    pdfEditor.setPageLabel(setOfPagesForChangingLabel, "C");

    setOfPagesForChangingLabel = new HashSet<Integer>(Arrays.asList(3));

    pdfEditor.setPageLabel(setOfPagesForChangingLabel, "D");
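Repeating the same two lines for every label works for four pages but scales poorly. As a sketch, the index-to-label pairs can be generated in a loop first. PageLabels and alphabetic are hypothetical names, and the limit of 26 labels is an assumption of this simple single-letter scheme:

```java
import java.util.LinkedHashMap;
import java.util.Map;

class PageLabels {
    // Maps page indices 0..count-1 to the labels "A", "B", "C", ...
    static Map<Integer, String> alphabetic(int count) {
        if (count < 0 || count > 26) {
            throw new IllegalArgumentException("Supports 0 to 26 labels: " + count);
        }
        Map<Integer, String> labels = new LinkedHashMap<>();
        for (int index = 0; index < count; index++) {
            labels.put(index, String.valueOf((char) ('A' + index)));
        }
        return labels;
    }

    public static void main(String[] args) {
        System.out.println(alphabetic(4));
    }
}
```

Each entry could then be applied with a single-element set and a setPageLabel call, in the same way as the code above.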

Now, save the results to disk:

savePdfFileToDisk("pdfWithUpdatedLabels", pdfEditor);

If you run the test, you’ll see that the first four pages of the processed PDF file are labeled with the capital letters A through D.

Change PDF Label Final Result

Here’s the full code snippet:

@Test
 public void changePageLabels() {
    FileDataProvider providerPDFFile = new FileDataProvider(
            new File("Assets/playground.pdf"));
    PdfDocument playGround = PdfDocument.open(providerPDFFile);
    DocumentEditor pdfEditor = playGround.createDocumentEditor();

    Set<Integer> setOfPagesForChangingLabel = new HashSet<Integer>(Arrays.asList(0));
    pdfEditor.setPageLabel(setOfPagesForChangingLabel, "A");

    setOfPagesForChangingLabel = new HashSet<Integer>(Arrays.asList(1));
    pdfEditor.setPageLabel(setOfPagesForChangingLabel, "B");

    setOfPagesForChangingLabel = new HashSet<Integer>(Arrays.asList(2));
    pdfEditor.setPageLabel(setOfPagesForChangingLabel, "C");

    setOfPagesForChangingLabel = new HashSet<Integer>(Arrays.asList(3));
    pdfEditor.setPageLabel(setOfPagesForChangingLabel, "D");

    savePdfFileToDisk("pdfWithUpdatedLabels", pdfEditor);
}

Creating a Complex Java PDF Editor Workflow Using JSON

So far, you’ve seen how to perform individual operations on a PDF file using the PSPDFKit Java PDF editor library. More specifically, you learned how to merge PDF files, rotate PDF pages, duplicate pages of a PDF, move pages of a PDF to rearrange them, extract pages from a PDF, remove pages from a PDF, and add new pages to an existing PDF.

You called different methods of the PSPDFKit API in your code to perform these operations. But it’s also possible to specify your operations in JSON format, along with their parameters, and the PSPDFKit API will call them for you. The source of the JSON can be an API endpoint, a file, a database, or anything else under the sun. This is done using the PSPDFKit DocumentEditor.addOperations(JSONArray) method.

Each operation is stored as a JSON object, which is then placed inside a JSON array and supplied to the addOperations method call. An example of such a chain of operations would be to remove some pages from a PDF, rotate some of the remaining pages, and set a new page label, as shown in the image below.

Java PDF Editor Complex Workflow Example

This example contains a JSON string that specifies a type of operation to be performed on a PDF file and relevant parameters. In the case of the removePages operation type, you only need to specify the page indices that need to be removed. The page indices are zero-based.

For rotatePages, specify on which pages of the PDF this operation will be performed, along with the amount of rotation to be applied on each page.

And for setPageLabel, specify the page numbers on which the operation will be performed and the label that will be set for those pages.
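If the operations are decided at runtime rather than stored in a file, the JSON text can also be assembled in code. The sketch below builds the three operation objects described above as plain strings, using only the JDK. The key names follow the chain.json file used in this section; the OperationsJson class and its methods are hypothetical helpers, and the label escaping is deliberately minimal:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class OperationsJson {
    // Renders page indices as a JSON array, for example [3, 4].
    static String indexes(List<Integer> pageIndexes) {
        return pageIndexes.stream()
                .map(String::valueOf)
                .collect(Collectors.joining(", ", "[", "]"));
    }

    static String removePages(List<Integer> pageIndexes) {
        return "{\"type\": \"removePages\", \"pageIndexes\": "
                + indexes(pageIndexes) + "}";
    }

    static String rotatePages(List<Integer> pageIndexes, int rotateBy) {
        return "{\"type\": \"rotatePages\", \"pageIndexes\": "
                + indexes(pageIndexes) + ", \"rotateBy\": " + rotateBy + "}";
    }

    // Escapes only backslashes and double quotes; enough for simple labels.
    static String setPageLabel(List<Integer> pageIndexes, String label) {
        String escaped = label.replace("\\", "\\\\").replace("\"", "\\\"");
        return "{\"type\": \"setPageLabel\", \"pageIndexes\": "
                + indexes(pageIndexes) + ", \"pageLabel\": \"" + escaped + "\"}";
    }

    // Wraps the individual operation objects in a JSON array.
    static String chain(String... operations) {
        return "[" + String.join(", ", operations) + "]";
    }

    public static void main(String[] args) {
        System.out.println(chain(
                removePages(Arrays.asList(3, 4)),
                rotatePages(Arrays.asList(0, 1), 180),
                setPageLabel(Arrays.asList(0), "New Label!")));
    }
}
```

For anything beyond simple labels, building the objects with a JSON library is the safer choice, since it handles all escaping rules.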

Begin by opening a PDF file from the assets included in the project (playground.pdf). Create a test method, chainOfOperationsOnPDF, for this purpose:

@Test
public void chainOfOperationsOnPDF()
{
    FileDataProvider providerPDFFile = new FileDataProvider(
             new File("Assets/playground.pdf"));
    PdfDocument playGround = PdfDocument.open(providerPDFFile);
    DocumentEditor pdfEditor = playGround.createDocumentEditor();
}

Next, add a new empty file, chain.json, to the Assets folder. To do this, right-click the folder and select the option New > File.

Paste the JSON below and save the file:

[
	{
		"type": "removePages",
		"pageIndexes": [3, 4]
	},
	{
		"type": "rotatePages",
		"pageIndexes": [0, 1],
		"rotateBy": 180
	},
	{
		"type": "setPageLabel",
		"pageIndexes": [0],
		"pageLabel": "New Label!"
	}
]

Go back to the test code and add the logic to read the JSON from the file:

String operations = "";
try {
    operations = new String(
            Files.readAllBytes(
                    Paths.get("Assets/chain.json")));
} catch (IOException e) {
    e.printStackTrace();
}

If your IDE shows an error for Files, Paths, or IOException, select the location of the error and press Alt-Enter to add the missing import.

Note that you can also make an API call to read the JSON over HTTP. For the sake of simplicity, in this tutorial, you’ll only read the PDF operations JSON from a local file.
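Two details of the file-reading snippet are worth knowing. First, new String(bytes) without a charset uses the JVM’s default charset, which varied by platform before JDK 18. Second, printing the stack trace and continuing leaves the operations string empty, so the failure only surfaces later. The following sketch reads the file as UTF-8 and fails immediately instead. The OperationsFile class and its methods are hypothetical, and the demo writes its own temporary file rather than relying on Assets/chain.json:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class OperationsFile {
    // Reads the whole file as UTF-8 text, failing fast if it cannot be read.
    static String read(Path path) {
        try {
            return new String(Files.readAllBytes(path), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException("Unable to read " + path, e);
        }
    }

    // Demo helper: writes content to a temporary .json file.
    static Path writeTemp(String content) {
        try {
            Path path = Files.createTempFile("chain", ".json");
            Files.write(path, content.getBytes(StandardCharsets.UTF_8));
            return path;
        } catch (IOException e) {
            throw new UncheckedIOException("Unable to write temp file", e);
        }
    }

    public static void main(String[] args) {
        Path path = writeTemp("[{\"type\": \"removePages\", \"pageIndexes\": [3, 4]}]");
        System.out.println(read(path));
    }
}
```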

After this, create a new object of JSONArray, and pass it to the DocumentEditor.addOperations method call. Then, save the PDF to disk:

JSONArray jsonOperations = new JSONArray(operations);
pdfEditor.addOperations(jsonOperations);
savePdfFileToDisk("chainOfOps", pdfEditor);

Here’s the full Java code snippet of performing a chain of operations on a PDF:

@Test
public void chainOfOperationsOnPDF()
{
    FileDataProvider providerPDFFile = new FileDataProvider(
            new File("Assets/playground.pdf"));
    PdfDocument playGround = PdfDocument.open(providerPDFFile);
    DocumentEditor pdfEditor = playGround.createDocumentEditor();

    String operations = "";
    try {
        operations = new String(
                Files.readAllBytes(
                        Paths.get("Assets/chain.json")));
    } catch (IOException e) {
        e.printStackTrace();
    }

    System.out.println(operations);

    JSONArray jsonOperations = new JSONArray(operations);

    pdfEditor.addOperations(jsonOperations);

    savePdfFileToDisk("chainOfOps", pdfEditor);

    }

Conclusion

This post provided you with a thorough look at how to manipulate PDFs with PSPDFKit Library for Java. You can add functionality beyond editing PDFs that enables you to import pages from other documents, create annotations, add and remove pages, and much more.

The PSPDFKit Library for Java offers an easy-to-use yet powerful API for manipulating PDFs. Check out our Java guides to view the full capabilities of our PDF library. You can also download our Catalog application to get up and running quickly using our readymade examples.
