Image and PDF Compression in C#
GdPicture.NET SDK enables you to dramatically reduce the file size of PDF documents, with a focus on font optimization, data compression, and image analysis.
PDF optimization involves chaining several compression algorithms to overcome the limitations of individual compression schemes. It also involves removing unwanted or unused objects from the PDF.
To compress a PDF document, follow these steps:
- Create a GdPicturePDFReducer object.
- Configure the metadata of the resulting PDF document with the following properties of the PDFReducerConfiguration object:

  | Property Name | Description |
  | --- | --- |
  | Author | Specifies the author of the resulting PDF document. |
  | Producer | Specifies the producer of the resulting PDF document. |
  | ProducerName | Specifies the name of the producer of the resulting PDF document. |
  | Title | Specifies the title of the resulting PDF document. |

- Configure the compression process with the following properties of the PDFReducerConfiguration object:

  | Property Name | Description |
  | --- | --- |
  | DownscaleImages | Specifies whether to downscale images. The default value is true. |
  | DownscaleResolution | Specifies the resolution to downscale images. The default value is 150. |
  | DownscaleResolutionMRC | Specifies the resolution for downscaling the background layer by the mixed raster content (MRC) engine. The default value is 100. |
  | EnableCharRepair | Specifies whether to perform character repair during bitonal conversion. The default value is false. |
  | EnableColorDetection | Specifies whether to perform color detection on images. The default value is true. |
  | EnableJBIG2 | Specifies whether to use the JBIG2 compression scheme to compress bitonal images. The default value is true. |
  | EnableJPEG2000 | Specifies whether to use the JPEG2000 compression scheme to compress the images. The default value is true. |
  | EnableMRC | Specifies whether to use MRC for compressing the content of the source PDF. The default value is false. |
  | EnableParallelization | Specifies whether to use multiple cores to speed up the process. Threads are dynamically allocated based on the real-time available CPU resources. The default value is true. |
  | FastWebView | Specifies whether to optimize the PDF for online distribution (linearized PDF). The default value is false. |
  | ImageQuality | Specifies the quality of the compressed images. The default value is PDFReducerImageQuality.ImageQualityMedium. |
  | JBIG2PMSThreshold | Specifies the threshold value for the JBIG2 encoder pattern matching and substitution between 0 and 1. Any number lower than 1 may lead to lossy compression. The default value is 0.75. |
  | MaxBitmapPerPage | Specifies the maximum number of bitmap images per page. |
  | OutputFormat | A member of the PDFReducerPDFVersion enumeration that specifies the version and the conformance level of the output PDF document. The default value is PDFReducerPDFVersion.PdfVersion15. |
  | PackDocument | Specifies whether to pack the PDF to reduce its size. The default value is true. |
  | PackFonts | Specifies whether to pack the PDF fonts to reduce their size. The default value is true. |
  | PreserveSmoothing | Specifies whether the MRC engine preserves smoothing between different layers. The default value is true. |
  | RecompressImages | Specifies whether to recompress the images. The default value is true. |
  | RemoveAnnotations | Specifies whether to remove annotations. The default value is false. |
  | RemoveBlankPages | Specifies whether to remove blank pages. The default value is false. |
  | RemoveBookmarks | Specifies whether to remove bookmarks. The default value is false. |
  | RemoveEmbeddedFiles | Specifies whether to remove embedded files. The default value is false. |
  | RemoveFormFields | Specifies whether to remove form fields. The default value is false. |
  | RemoveHyperlinks | Specifies whether to remove hyperlinks. The default value is false. |
  | RemoveJavaScript | Specifies whether to remove JavaScript. The default value is false. |
  | RemoveMetadata | Specifies whether to remove metadata. The default value is false. |
  | RemovePagePieceInfo | Specifies whether to remove the page PieceInfo dictionary used to hold private application data. The default value is true. |
  | RemovePageThumbnails | Specifies whether to remove page thumbnails. The default value is false. |
  | UnembedFonts | Specifies whether to remove embedded font data. The default value is false. |

- Run the compression process with the ProcessDocument method. This method takes the path to the source and the output PDF files as its parameters.
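The range stated for JBIG2PMSThreshold in the property table can be checked before a value is assigned. The following is a minimal sketch that does not depend on the SDK. The function names ValidateJbig2Threshold and MayBeLossy are hypothetical and are not part of GdPicture.NET; they only encode what the table says, namely that valid values lie between 0 and 1 and that any value lower than 1 may lead to lossy compression.

```csharp
using System;

// Hypothetical helper, not part of the GdPicture.NET API.
// Throws if the threshold is outside the documented range of 0 to 1.
static float ValidateJbig2Threshold(float threshold)
{
    if (float.IsNaN(threshold) || threshold < 0f || threshold > 1f)
    {
        throw new ArgumentOutOfRangeException(
            nameof(threshold), threshold, "JBIG2PMSThreshold must be between 0 and 1.");
    }
    return threshold;
}

// Hypothetical helper: per the table, any value lower than 1 may be lossy.
static bool MayBeLossy(float threshold) => ValidateJbig2Threshold(threshold) < 1f;

Console.WriteLine(MayBeLossy(0.75f)); // the documented default; prints "True"
Console.WriteLine(MayBeLossy(1f));    // prints "False"
```

A validated value could then be assigned to PDFReducerConfiguration.JBIG2PMSThreshold in the same way as in the examples in this guide.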
General Optimization of PDF Documents
The example below focuses on general aspects of PDF optimization such as content removal and font optimization:
    using GdPicturePDFReducer gdpicturePDFReducer = new GdPicturePDFReducer();
    // Configure the metadata of the resulting PDF document.
    gdpicturePDFReducer.PDFReducerConfiguration.Author = "GdPicture.NET PDF Reducer SDK";
    gdpicturePDFReducer.PDFReducerConfiguration.Producer = "GdPicture.NET 14";
    gdpicturePDFReducer.PDFReducerConfiguration.ProducerName = "PSPDFKit";
    gdpicturePDFReducer.PDFReducerConfiguration.Title = "PDF Optimization";
    // Specify the version and the conformance level of the output PDF document.
    gdpicturePDFReducer.PDFReducerConfiguration.OutputFormat = PDFReducerPDFVersion.PdfVersionRetainExisting;
    // Configure the compression process by removing document elements.
    gdpicturePDFReducer.PDFReducerConfiguration.RemoveAnnotations = true;
    gdpicturePDFReducer.PDFReducerConfiguration.RemoveBlankPages = true;
    gdpicturePDFReducer.PDFReducerConfiguration.RemoveBookmarks = true;
    gdpicturePDFReducer.PDFReducerConfiguration.RemoveEmbeddedFiles = true;
    gdpicturePDFReducer.PDFReducerConfiguration.RemoveFormFields = true;
    gdpicturePDFReducer.PDFReducerConfiguration.RemoveHyperlinks = true;
    gdpicturePDFReducer.PDFReducerConfiguration.RemoveJavaScript = true;
    gdpicturePDFReducer.PDFReducerConfiguration.RemoveMetadata = true;
    gdpicturePDFReducer.PDFReducerConfiguration.RemovePageThumbnails = true;
    // Optimize the output file size by packing fonts.
    gdpicturePDFReducer.PDFReducerConfiguration.PackFonts = true;
    // Optimize the output file size by packing the document.
    gdpicturePDFReducer.PDFReducerConfiguration.PackDocument = true;
    // Run the compression process.
    gdpicturePDFReducer.ProcessDocument(@"C:\temp\source.pdf", @"C:\temp\output.pdf");

    Using gdpicturePDFReducer As GdPicturePDFReducer = New GdPicturePDFReducer()
        'Configure the metadata of the resulting PDF document.
        gdpicturePDFReducer.PDFReducerConfiguration.Author = "GdPicture.NET PDF Reducer SDK"
        gdpicturePDFReducer.PDFReducerConfiguration.Producer = "GdPicture.NET 14"
        gdpicturePDFReducer.PDFReducerConfiguration.ProducerName = "PSPDFKit"
        gdpicturePDFReducer.PDFReducerConfiguration.Title = "PDF Optimization"
        'Specify the version and the conformance level of the output PDF document.
        gdpicturePDFReducer.PDFReducerConfiguration.OutputFormat = PDFReducerPDFVersion.PdfVersionRetainExisting
        'Configure the compression process by removing document elements.
        gdpicturePDFReducer.PDFReducerConfiguration.RemoveAnnotations = True
        gdpicturePDFReducer.PDFReducerConfiguration.RemoveBlankPages = True
        gdpicturePDFReducer.PDFReducerConfiguration.RemoveBookmarks = True
        gdpicturePDFReducer.PDFReducerConfiguration.RemoveEmbeddedFiles = True
        gdpicturePDFReducer.PDFReducerConfiguration.RemoveFormFields = True
        gdpicturePDFReducer.PDFReducerConfiguration.RemoveHyperlinks = True
        gdpicturePDFReducer.PDFReducerConfiguration.RemoveJavaScript = True
        gdpicturePDFReducer.PDFReducerConfiguration.RemoveMetadata = True
        gdpicturePDFReducer.PDFReducerConfiguration.RemovePageThumbnails = True
        'Optimize the output file size by packing fonts.
        gdpicturePDFReducer.PDFReducerConfiguration.PackFonts = True
        'Optimize the output file size by packing the document.
        gdpicturePDFReducer.PDFReducerConfiguration.PackDocument = True
        'Run the compression process. VB.NET string literals take no @ prefix.
        gdpicturePDFReducer.ProcessDocument("C:\temp\source.pdf", "C:\temp\output.pdf")
    End Using
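To see what a given configuration actually saves, the source and output files can be compared once ProcessDocument has finished. The following is a minimal sketch that uses only System.IO. The function ReductionPercent is a hypothetical helper, not an SDK call, and the file paths are simply the ones used in the example above.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of the GdPicture.NET API.
// Returns the size reduction as a percentage of the original size.
static double ReductionPercent(long originalBytes, long compressedBytes)
{
    if (originalBytes <= 0)
    {
        throw new ArgumentOutOfRangeException(
            nameof(originalBytes), "Original size must be positive.");
    }
    return 100.0 * (originalBytes - compressedBytes) / originalBytes;
}

// Paths taken from the example above; adjust them to your environment.
string sourcePath = @"C:\temp\source.pdf";
string outputPath = @"C:\temp\output.pdf";

// Only report when both files are present.
if (File.Exists(sourcePath) && File.Exists(outputPath))
{
    long before = new FileInfo(sourcePath).Length;
    long after = new FileInfo(outputPath).Length;
    Console.WriteLine(
        $"{before} bytes -> {after} bytes ({ReductionPercent(before, after):F1}% smaller)");
}
```

A negative result means the output file is larger than the source, which can be a hint that the chosen settings do not suit that particular document.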
Recompressing Images
Compress PDF documents by recompressing existing images in a file. For example, decreasing unnecessarily high resolutions can dramatically reduce the file size without affecting the viewing experience.
The example below shows how to recompress images:
    using GdPicturePDFReducer gdpicturePDFReducer = new GdPicturePDFReducer();
    // Configure the metadata of the resulting PDF document.
    gdpicturePDFReducer.PDFReducerConfiguration.Author = "GdPicture.NET PDF Reducer SDK";
    gdpicturePDFReducer.PDFReducerConfiguration.Producer = "GdPicture.NET 14";
    gdpicturePDFReducer.PDFReducerConfiguration.ProducerName = "PSPDFKit";
    gdpicturePDFReducer.PDFReducerConfiguration.Title = "Re-Compress Images";
    // Specify the version and the conformance level of the output PDF document.
    gdpicturePDFReducer.PDFReducerConfiguration.OutputFormat = PDFReducerPDFVersion.PdfVersionRetainExisting;
    // Recompress images to obtain a better compression ratio.
    gdpicturePDFReducer.PDFReducerConfiguration.RecompressImages = true;
    gdpicturePDFReducer.PDFReducerConfiguration.ImageQuality = PDFReducerImageQuality.ImageQualityHigh;
    // Reduce the image size by decreasing the image resolution.
    gdpicturePDFReducer.PDFReducerConfiguration.DownscaleImages = true;
    gdpicturePDFReducer.PDFReducerConfiguration.DownscaleResolution = 200;
    // Run the compression process.
    gdpicturePDFReducer.ProcessDocument(@"C:\temp\source.pdf", @"C:\temp\output.pdf");

    Using gdpicturePDFReducer As GdPicturePDFReducer = New GdPicturePDFReducer()
        'Configure the metadata of the resulting PDF document.
        gdpicturePDFReducer.PDFReducerConfiguration.Author = "GdPicture.NET PDF Reducer SDK"
        gdpicturePDFReducer.PDFReducerConfiguration.Producer = "GdPicture.NET 14"
        gdpicturePDFReducer.PDFReducerConfiguration.ProducerName = "PSPDFKit"
        gdpicturePDFReducer.PDFReducerConfiguration.Title = "Re-Compress Images"
        'Specify the version and the conformance level of the output PDF document.
        gdpicturePDFReducer.PDFReducerConfiguration.OutputFormat = PDFReducerPDFVersion.PdfVersionRetainExisting
        'Recompress images to obtain a better compression ratio.
        gdpicturePDFReducer.PDFReducerConfiguration.RecompressImages = True
        gdpicturePDFReducer.PDFReducerConfiguration.ImageQuality = PDFReducerImageQuality.ImageQualityHigh
        'Reduce the image size by decreasing the image resolution.
        gdpicturePDFReducer.PDFReducerConfiguration.DownscaleImages = True
        gdpicturePDFReducer.PDFReducerConfiguration.DownscaleResolution = 200
        'Run the compression process. VB.NET string literals take no @ prefix.
        gdpicturePDFReducer.ProcessDocument("C:\temp\source.pdf", "C:\temp\output.pdf")
    End Using
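Why lowering the resolution pays off can be shown with simple arithmetic: the pixel count of a raster image grows with the square of its resolution. The sketch below is independent of the SDK. PixelCount is a hypothetical helper, the 8.5 by 11 inch page and the 600 DPI source resolution are assumed example values, and the resolution is assumed to be expressed in dots per inch.

```csharp
using System;

// Hypothetical helper: number of pixels for a page of the given physical size
// rendered at the given resolution (dots per inch).
static long PixelCount(double widthInches, double heightInches, int dpi)
{
    long widthPx = (long)Math.Round(widthInches * dpi);
    long heightPx = (long)Math.Round(heightInches * dpi);
    return widthPx * heightPx;
}

// Assumed example: an 8.5 x 11 inch page scanned at 600 DPI,
// then downscaled to the 200 DPI used in the example above.
long at600 = PixelCount(8.5, 11, 600);
long at200 = PixelCount(8.5, 11, 200);
Console.WriteLine($"{at600} px at 600 DPI, {at200} px at 200 DPI");
Console.WriteLine($"ratio: {(double)at600 / at200}");
```

Under these assumptions, the downscaled page holds nine times fewer pixels. The compressed file size does not shrink by exactly the same factor, because it also depends on the image content and the compression scheme.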
Controlling Image Compression
The PDF specification allows for seven compression schemes, all of which can be used to compress images. For example, two popular compression schemes are the following:
- JBIG2 for bitonal images (usually black and white).
- JPEG2000 for 24-bit color and 8-bit grayscale images.
The example below uses both of these schemes to compress images in a PDF document:
    using GdPicturePDFReducer gdpicturePDFReducer = new GdPicturePDFReducer();
    // Configure the metadata of the resulting PDF document.
    gdpicturePDFReducer.PDFReducerConfiguration.Author = "GdPicture.NET PDF Reducer SDK";
    gdpicturePDFReducer.PDFReducerConfiguration.Producer = "GdPicture.NET 14";
    gdpicturePDFReducer.PDFReducerConfiguration.ProducerName = "PSPDFKit";
    gdpicturePDFReducer.PDFReducerConfiguration.Title = "Image Compression";
    // Specify the version and the conformance level of the output PDF document.
    gdpicturePDFReducer.PDFReducerConfiguration.OutputFormat = PDFReducerPDFVersion.PdfVersionRetainExisting;
    // Enable automatic color detection.
    gdpicturePDFReducer.PDFReducerConfiguration.EnableColorDetection = true;
    // Repair characters.
    gdpicturePDFReducer.PDFReducerConfiguration.EnableCharRepair = true;
    // Control image compression.
    gdpicturePDFReducer.PDFReducerConfiguration.EnableJPEG2000 = true;
    gdpicturePDFReducer.PDFReducerConfiguration.EnableJBIG2 = true;
    gdpicturePDFReducer.PDFReducerConfiguration.JBIG2PMSThreshold = 0.65f;
    // Run the compression process.
    gdpicturePDFReducer.ProcessDocument(@"C:\temp\source.pdf", @"C:\temp\output.pdf");

    Using gdpicturePDFReducer As GdPicturePDFReducer = New GdPicturePDFReducer()
        'Configure the metadata of the resulting PDF document.
        gdpicturePDFReducer.PDFReducerConfiguration.Author = "GdPicture.NET PDF Reducer SDK"
        gdpicturePDFReducer.PDFReducerConfiguration.Producer = "GdPicture.NET 14"
        gdpicturePDFReducer.PDFReducerConfiguration.ProducerName = "PSPDFKit"
        gdpicturePDFReducer.PDFReducerConfiguration.Title = "Image Compression"
        'Specify the version and the conformance level of the output PDF document.
        gdpicturePDFReducer.PDFReducerConfiguration.OutputFormat = PDFReducerPDFVersion.PdfVersionRetainExisting
        'Enable automatic color detection.
        gdpicturePDFReducer.PDFReducerConfiguration.EnableColorDetection = True
        'Repair characters.
        gdpicturePDFReducer.PDFReducerConfiguration.EnableCharRepair = True
        'Control image compression.
        gdpicturePDFReducer.PDFReducerConfiguration.EnableJPEG2000 = True
        gdpicturePDFReducer.PDFReducerConfiguration.EnableJBIG2 = True
        gdpicturePDFReducer.PDFReducerConfiguration.JBIG2PMSThreshold = 0.65F
        'Run the compression process. VB.NET string literals take no @ prefix.
        gdpicturePDFReducer.ProcessDocument("C:\temp\source.pdf", "C:\temp\output.pdf")
    End Using
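The pairing of compression scheme and image type described in this section can be written down as a small lookup. This is only an illustration of the rule of thumb in the text, not the SDK's internal selection logic, and SuggestScheme is a hypothetical name.

```csharp
using System;

// Hypothetical helper illustrating the rule of thumb from the text;
// this is not how GdPicture.NET chooses a scheme internally.
static string SuggestScheme(int bitsPerPixel)
{
    switch (bitsPerPixel)
    {
        case 1:
            return "JBIG2";       // bitonal images
        case 8:                   // 8-bit grayscale
        case 24:
            return "JPEG2000";    // 24-bit color
        default:
            return "unspecified"; // the text names no scheme for other depths
    }
}

foreach (int depth in new[] { 1, 8, 24 })
{
    Console.WriteLine($"{depth}-bit -> {SuggestScheme(depth)}");
}
```

In the SDK itself, the corresponding switches are the EnableJBIG2 and EnableJPEG2000 properties shown in the example above.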