Export annotation data from PDFs in C# .NET
To export data from a PDF annotation, follow the steps below:
- Create a `GdPicturePDF` object.
- Load the PDF file with the `LoadFromFile` method.
- Loop through all PDF pages.
- Select the PDF page to read annotation data from with the `SelectPage` method.
- Get the number of annotations on the selected page with the `GetAnnotationCount` method.
- Loop through all annotations on the page.
- Use a method to get the annotation data and save it to a variable. For more information, refer to the guide on getting PDF annotation properties.
The following code example gets the index number, type, and contents of all annotations and saves them to a CSV file:
    using System.IO;
    using System.Text;
    using GdPicture14;

    using GdPicturePDF gdpicturePDF = new GdPicturePDF();
    gdpicturePDF.LoadFromFile(@"C:\temp\source.pdf");
    // Create a `StringBuilder` variable to store data.
    StringBuilder data = new StringBuilder();
    // Add headers to the first line.
    string[] headers = { "Index", "Annotation Type", "Contents", "Page" };
    data.AppendLine(string.Join(";", headers));
    // Get the number of pages.
    int pageCount = gdpicturePDF.GetPageCount();
    // Loop through all pages.
    for (int page = 1; page <= pageCount; page++)
    {
        // Select a PDF page.
        gdpicturePDF.SelectPage(page);
        // Get the number of annotations on the selected page.
        int annotCount = gdpicturePDF.GetAnnotationCount();
        // Loop through all annotations on the page.
        for (int i = 0; i < annotCount; i++)
        {
            // Get the current annotation type.
            string annotType = gdpicturePDF.GetAnnotationSubType(i);
            // Get the current annotation contents.
            string annotContents = gdpicturePDF.GetAnnotationContents(i);
            // Add a new line to the `StringBuilder` with the annotation data.
            string[] newLine = { i.ToString(), annotType, annotContents, page.ToString() };
            data.AppendLine(string.Join(";", newLine));
        }
    }
    // Save the collected data to a CSV file, overwriting any existing file
    // so that the header line is not repeated when the example runs again.
    string outputPath = @"C:\temp\output.csv";
    File.WriteAllText(outputPath, data.ToString());
    Imports System.IO
    Imports System.Text
    Imports GdPicture14

    Using gdpicturePDF As GdPicturePDF = New GdPicturePDF()
        gdpicturePDF.LoadFromFile("C:\temp\source.pdf")
        ' Create a `StringBuilder` variable to store data.
        Dim data As StringBuilder = New StringBuilder()
        ' Add headers to the first line.
        Dim headers = {"Index", "Annotation Type", "Contents", "Page"}
        data.AppendLine(String.Join(";", headers))
        ' Get the number of pages.
        Dim pageCount As Integer = gdpicturePDF.GetPageCount()
        ' Loop through all pages.
        For page = 1 To pageCount
            ' Select a PDF page.
            gdpicturePDF.SelectPage(page)
            ' Get the number of annotations on the selected page.
            Dim annotCount As Integer = gdpicturePDF.GetAnnotationCount()
            ' Loop through all annotations on the page.
            For i = 0 To annotCount - 1
                ' Get the current annotation type.
                Dim annotType As String = gdpicturePDF.GetAnnotationSubType(i)
                ' Get the current annotation contents.
                Dim annotContents As String = gdpicturePDF.GetAnnotationContents(i)
                ' Add a new line to the `StringBuilder` with the annotation data.
                Dim newLine As String() = {i.ToString(), annotType, annotContents, page.ToString()}
                data.AppendLine(String.Join(";", newLine))
            Next
        Next
        ' Save the collected data to a CSV file, overwriting any existing file
        ' so that the header line is not repeated when the example runs again.
        Dim outputPath = "C:\temp\output.csv"
        File.WriteAllText(outputPath, data.ToString())
    End Using
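
The example above joins fields with a semicolon and writes them as is. Annotation contents are free text, so they may contain semicolons, double quotes, or line breaks, any of which would shift or split columns in the resulting CSV file. The following is a minimal sketch of one way to guard against that, assuming the common CSV quoting convention (wrap the field in double quotes and double any embedded quotes). `CsvRow`, `Escape`, and `Join` are hypothetical helper names used for illustration; they are not part of GdPicture.NET.

```csharp
using System;
using System.Linq;

// Hypothetical helper for illustration; not part of GdPicture.NET.
public static class CsvRow
{
    // Quotes a field when it contains the separator, a double quote, or a line break.
    public static string Escape(string field, char separator = ';')
    {
        // Treat a missing value as an empty field.
        if (string.IsNullOrEmpty(field)) return string.Empty;

        bool needsQuotes = field.IndexOf(separator) >= 0
            || field.IndexOfAny(new[] { '"', '\r', '\n' }) >= 0;
        if (!needsQuotes) return field;

        // Double any embedded quotes, then wrap the whole field in quotes.
        return "\"" + field.Replace("\"", "\"\"") + "\"";
    }

    // Escapes every field and joins them with the separator.
    public static string Join(char separator, params string[] fields)
    {
        return string.Join(separator.ToString(), fields.Select(f => Escape(f, separator)));
    }
}

public static class Program
{
    public static void Main()
    {
        // Sample row in the same column order as the example: index, type, contents, page.
        Console.WriteLine(CsvRow.Join(';', "0", "Text", "Check this; then \"approve\"", "1"));
        // Prints: 0;Text;"Check this; then ""approve""";1
    }
}
```

With a helper like this, the line that builds each row in the loop could become `data.AppendLine(CsvRow.Join(';', i.ToString(), annotType, annotContents, page.ToString()));`, leaving the rest of the example unchanged.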