How to Optimize PDF Files using a C# PDF API

Chelsea Devereaux - Mar 3 - Dev Community

Tutorial Concept
Learn how to programmatically reduce PDF file size and improve performance in C# with advanced optimization techniques using a .NET PDF API library.

What You Will Need

  • Document Solutions for PDF NuGet Package
  • .NET 8+


As technology advances, optimization is key to improving efficiency. For PDFs, this means reducing file size and removing unused objects, which leads to faster loading, easier sharing, and more efficient storage without compromising quality. MESCIUS’ Document Solutions for PDF (DsPdf) API helps developers create, modify, and optimize PDFs programmatically. It gives full control over document handling, including a CompressionLevel property for setting different compression levels.

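In this blog, we’ll explore some additional PDF optimization techniques that we can achieve using DsPdf, including:

  • Eliminating Redundant Images
  • Optimizing Fonts
  • Removing Embedded Files
  • Flattening PDF Forms
  • Reducing Document Size Using Object Streams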

Eliminating Redundant Images

PDFs often contain duplicate images when the same image appears multiple times across pages, unnecessarily increasing file size. This leads to higher storage needs and difficulty in sharing. By keeping a single instance of an image and referencing it throughout the document, file size can be significantly reduced.

DsPdf provides the RemoveDuplicateImages method, which detects and eliminates redundant images from the PDF. An example demonstrating how to use this method is below:

    using System.IO;
    using GrapeCity.Documents.Pdf;

    // Open the PDF file; the stream is disposed when it goes out of scope.
    using FileStream fs = File.OpenRead("Invoice.pdf");

    // Initialize GcPdfDocument and load the PDF from the stream.
    GcPdfDocument doc = new GcPdfDocument();
    doc.Load(fs);

    // Remove duplicate images.
    doc.RemoveDuplicateImages();

    // Save PDF document.
    doc.Save("RemovedDuplicateImages.pdf");

You can download the sample to remove duplicate images from your PDF.
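To check whether an optimization step actually pays off for a given document, it can help to compare file sizes before and after. The following is a minimal sketch that uses only the .NET base class library; the `PdfSizeReport` class and its method names are hypothetical helpers introduced here for illustration and are not part of DsPdf.

```csharp
using System;
using System.IO;

public static class PdfSizeReport
{
    // Returns the percentage by which the optimized file is smaller than the original.
    // A negative result means the file grew.
    public static double PercentSaved(long beforeBytes, long afterBytes)
    {
        if (beforeBytes <= 0)
            throw new ArgumentOutOfRangeException(nameof(beforeBytes), "Original size must be positive.");
        return (beforeBytes - afterBytes) * 100.0 / beforeBytes;
    }

    // Reads the sizes of two files on disk and prints a one-line summary.
    public static void Print(string originalPath, string optimizedPath)
    {
        long before = new FileInfo(originalPath).Length;
        long after = new FileInfo(optimizedPath).Length;
        Console.WriteLine($"{before} bytes -> {after} bytes ({PercentSaved(before, after):F1}% saved)");
    }
}
```

For example, calling `PdfSizeReport.Print("Invoice.pdf", "RemovedDuplicateImages.pdf")` after saving the optimized document reports how much smaller the output is. The savings depend on the document: a PDF without repeated images will see little or no change from this step.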

Optimizing Fonts

Sometimes PDFs can contain unused or duplicate fonts, which occurs when multiple subsets of the same font are embedded. Excessive font data can slow down document processing and consume more storage space. By merging font subsets, file size can be significantly reduced while maintaining text integrity. DsPdf simplifies this with the OptimizeFonts method in the GcPdfDocument class.

    // Open the PDF file; the stream is disposed when it goes out of scope.
    using FileStream fs = File.OpenRead("Invoice.pdf");

    // Initialize GcPdfDocument and load the PDF from the stream.
    GcPdfDocument doc = new GcPdfDocument();
    doc.Load(fs);

    // Optimize the font usage.
    doc.OptimizeFonts();

    // Save PDF document.
    doc.Save("OptimizedFonts.pdf");

Removing Embedded Files

A PDF with embedded files stores attachments like images or documents, increasing file size and slowing performance. Removing unnecessary attachments, such as redundant or archived files, helps optimize the document. DsPdf makes this easier with the Clear method of the FileSpecification collection that we can access through the EmbeddedFiles property. This reduces file size for better storage and sharing.

    // Open the PDF file; the stream is disposed when it goes out of scope.
    using FileStream fs = File.OpenRead("Invoice.pdf");

    // Initialize GcPdfDocument and load the PDF from the stream.
    GcPdfDocument doc = new GcPdfDocument();
    doc.Load(fs);

    // Remove all embedded files (attachments) from the document.
    doc.EmbeddedFiles.Clear();

    // Save PDF document.
    doc.Save("RemovedEmbeddedFiles.pdf");

Flattening PDF Forms

Interactive PDF forms store fields separately, inflating file size and decreasing performance. They also remain editable, posing security risks. Flattening converts these fields into static content, reducing size, improving rendering, and preventing edits. This makes documents easier to share and archive without unintended modifications. DsPdf streamlines this with the FormXObject class and DrawAnnotations method. The example below shows how to flatten PDFs using the DsPdf API:

    // Load the source PDF that contains the form.
    using FileStream fs = File.OpenRead("Invoice.pdf");
    GcPdfDocument srcDoc = new GcPdfDocument();
    srcDoc.Load(fs);

    // Draw all pages and annotations of the source PDF into a new PDF:
    var doc = new GcPdfDocument();
    foreach (var srcPage in srcDoc.Pages)
    {
        var page = doc.Pages.Add();
        var fxo = new FormXObject(doc, srcPage);
        page.Graphics.DrawForm(fxo, page.Bounds, null, ImageAlign.Default);
        // This method draws all annotations on the page, including form field widgets:
        srcPage.DrawAnnotations(page.Graphics, page.Bounds);
    }

    // Save the flattened PDF.
    doc.Save("FlattenedForm.pdf");

Reducing Document Size Using Object Streams

Large PDFs often store their objects inefficiently, which makes them cumbersome to handle, share, and store. Object streams address this by packing indirect objects into compressed streams instead of writing each one individually at the file’s outermost level. This can significantly reduce PDF size while maintaining content integrity.

DsPdf exposes this optimization through the UseObjectStreams setting in the SavePdfOptions class, giving developers control over how object streams are applied. The UseObjectStreams values are:

  • None: Disables object streams, storing objects in their standard format.
  • Single: Uses one object stream for the entire document, significantly reducing file size but potentially causing a slight delay when opening the PDF.
  • Multiple: Distributes objects across multiple streams, slightly increasing file size compared to Single but ensuring faster document loading in PDF viewers.

    // Open the PDF file; the stream is disposed when it goes out of scope.
    using FileStream fs = File.OpenRead("Invoice.pdf");

    // Initialize GcPdfDocument and load the PDF from the stream.
    GcPdfDocument doc = new GcPdfDocument();
    doc.Load(fs);

    // Save the PDF using a single object stream.
    doc.Save("InvoiceObjectStreams.pdf", new SavePdfOptions(SaveMode.Default, PdfStreamHandling.MinimizeSize, UseObjectStreams.Single));

You can download this sample to optimize PDFs using Object Streams.
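To see how the three settings trade file size against loading behavior for a particular document, one option is to save the same PDF once per mode and compare the results. The sketch below reuses the SavePdfOptions constructor shown above; the output file names are placeholders, and the `GrapeCity.Documents.Pdf` namespace and the `Save` overload taking a file path plus options are assumptions to verify against your DsPdf version.

```csharp
using System;
using System.IO;
using GrapeCity.Documents.Pdf; // assumed namespace for GcPdfDocument and SavePdfOptions

// Load the source document once and keep the stream open while the document is in use.
using FileStream fs = File.OpenRead("Invoice.pdf");
GcPdfDocument doc = new GcPdfDocument();
doc.Load(fs);

// Save one copy per object stream mode and report the size of each output file.
foreach (UseObjectStreams mode in new[] { UseObjectStreams.None, UseObjectStreams.Single, UseObjectStreams.Multiple })
{
    string output = $"Invoice_{mode}.pdf"; // placeholder output name
    doc.Save(output, new SavePdfOptions(SaveMode.Default, PdfStreamHandling.MinimizeSize, mode));
    Console.WriteLine($"{mode}: {new FileInfo(output).Length} bytes");
}
```

The actual numbers vary by document, so measuring on representative files is more reliable than assuming a fixed gain.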

Conclusion

In this blog, we covered key DsPdf techniques, such as removing redundant images, optimizing fonts, clearing embedded files, flattening forms, and using object streams. These methods streamline PDFs while maintaining quality.
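These techniques can also be chained in a single pass. The following is a rough sketch that combines only the calls shown earlier in this post; the file names are placeholders, the `GrapeCity.Documents.Pdf` namespace is an assumption, and each step should be kept only if it suits the document (for example, skip clearing embedded files when the attachments are still needed).

```csharp
using System.IO;
using GrapeCity.Documents.Pdf; // assumed namespace for GcPdfDocument and SavePdfOptions

// Load the source document; the stream stays open until the end of the scope.
using FileStream fs = File.OpenRead("Invoice.pdf");
GcPdfDocument doc = new GcPdfDocument();
doc.Load(fs);

// Keep a single copy of each repeated image.
doc.RemoveDuplicateImages();

// Merge duplicate font subsets and drop unused fonts.
doc.OptimizeFonts();

// Remove all attachments (omit this line if they must be preserved).
doc.EmbeddedFiles.Clear();

// Write the result with compact object storage.
doc.Save("Invoice.optimized.pdf",
    new SavePdfOptions(SaveMode.Default, PdfStreamHandling.MinimizeSize, UseObjectStreams.Multiple));
```

Form flattening is left out of this sketch because it builds a new document rather than modifying the loaded one.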

Beyond optimization, DsPdf also supports advanced PDF manipulation, offering powerful tools for developers. Start leveraging these features today for more efficient document management.

Explore our Product Demo and Documentation for more insights.

If you have any questions, feel free to ask in the comments below!
