Merge Files to PDF using custom Merge Settings and Muhimbi's XML Override

Jonathan D. Rhyne

At the time of writing, Muhimbi’s range of PDF Conversion and Document Manipulation servers and APIs has been on the market for nearly 12 years. It will come as no great surprise that during those 12 years we have received many requests from customers to implement all kinds of arcane features to suit their particular requirements.

When implementing feature requests, we have always applied one simple rule: we are happy to implement new functionality provided it is generic in nature and can be used by all our customers.

Recently, a large international sports organisation approached us to replace their legacy on-premises system with our cloud-based service. Our software ticked most boxes, but a few edge cases required functionality that we did not support, specifically:

  1. Create PDF Bookmarks (and therefore a Table of Contents) based on MS-Word styles that are not defined as headings.

  2. Maintain the correct hierarchy of PDF Bookmarks for MS-Word files that don’t start with a Heading 1.

Pretty esoteric stuff. How can we expose niche functionality like this in our system and user interfaces without confusing thousands of users who have no interest in it? Well, it turns out we have dealt with this before: we introduced the concept of XML Override to our Convert Document action all the way back in 2012. Using a bit of XML you can set or override almost any setting supported by our comprehensive object model.

So, we added an XML Override facility to our Merge action as well. At the time of writing, this new facility is available in Power Automate (Flow) and in our REST-based API. In an upcoming release we’ll add it to the SharePoint Designer and Nintex Workflow actions as well. Naturally, all this functionality is available natively in our SOAP API.

Let’s take the following example, where we merge documents as normal, but with the following changes:

  1. Apply the custom rules only to the MS-Word files that are being merged. To accomplish this we have specified a regular expression in the filterValue attribute of the SourceFile element, which is matched against the property named in the filter attribute of the SourceFiles element (here, the file extension).

  2. Only generate PDF Bookmarks for the first 3 Heading levels and ignore everything else. We achieve this by setting LowerBookmarkLevel to 3.

  3. Map a custom style named ‘MyFakeHeadingStyle’, which is not defined in MS-Word as a heading style, to heading level 2. We achieve this by adding it to the list of Bookmark Mappings.

This results in the following XML.

<Override>
    <ProcessingOptions>
        <SourceFiles filter="property:SourceFile.OpenOptions.FileExtension">
            <SourceFile filterValue="regex:^docx$">
                <ConversionSettings>
                    <GenerateBookmarks>Custom</GenerateBookmarks>
                    <ConverterSpecificSettings type="ConverterSpecificSettings_WordProcessing">
                        <BookmarkOptions>
                            <UseHeadingStyles>True</UseHeadingStyles>
                            <LowerBookmarkLevel>3</LowerBookmarkLevel>
                            <BookmarkMappings>
                                <BookmarkMapping>
                                    <Source>MyFakeHeadingStyle</Source>
                                    <Level>2</Level>
                                </BookmarkMapping>
                            </BookmarkMappings>
                        </BookmarkOptions>
                    </ConverterSpecificSettings>
                </ConversionSettings>
            </SourceFile>
        </SourceFiles>
    </ProcessingOptions>
</Override>

We can take this XML and paste it into the ‘Override settings’ field of our Power Automate Merge documents action. A full example of ‘iterating over multiple files and compiling a list of files to merge’ is beyond the scope of this post. An example can be found here.
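Before pasting the XML anywhere, it can be worth checking locally that it is well-formed and that the filter selects the extensions you expect. The following is a minimal sketch in Python, not part of Muhimbi's tooling. The helper names `summarize_override` and `matches` are made up for illustration, and Python's `re` module is only a stand-in for whatever regular expression engine the service uses, so details such as case sensitivity may differ on the server.

```python
import re
import xml.etree.ElementTree as ET

# The override XML from this post, copied verbatim.
OVERRIDE_XML = """<Override>
    <ProcessingOptions>
        <SourceFiles filter="property:SourceFile.OpenOptions.FileExtension">
            <SourceFile filterValue="regex:^docx$">
                <ConversionSettings>
                    <GenerateBookmarks>Custom</GenerateBookmarks>
                    <ConverterSpecificSettings type="ConverterSpecificSettings_WordProcessing">
                        <BookmarkOptions>
                            <UseHeadingStyles>True</UseHeadingStyles>
                            <LowerBookmarkLevel>3</LowerBookmarkLevel>
                            <BookmarkMappings>
                                <BookmarkMapping>
                                    <Source>MyFakeHeadingStyle</Source>
                                    <Level>2</Level>
                                </BookmarkMapping>
                            </BookmarkMappings>
                        </BookmarkOptions>
                    </ConverterSpecificSettings>
                </ConversionSettings>
            </SourceFile>
        </SourceFiles>
    </ProcessingOptions>
</Override>"""


def summarize_override(xml_text):
    """Parse the override XML; return (filter pattern, lower level, style mappings)."""
    # ET.fromstring raises ET.ParseError if the XML is not well-formed.
    root = ET.fromstring(xml_text)
    source_file = root.find("./ProcessingOptions/SourceFiles/SourceFile")
    value = source_file.get("filterValue")
    # The example prefixes the pattern with "regex:"; strip it if present.
    pattern = value[len("regex:"):] if value.startswith("regex:") else value
    lower_level = int(root.find(".//LowerBookmarkLevel").text)
    mappings = {
        m.findtext("Source"): int(m.findtext("Level"))
        for m in root.iter("BookmarkMapping")
    }
    return pattern, lower_level, mappings


def matches(pattern, extension):
    """Local approximation of the filter, using Python's regex engine."""
    return re.search(pattern, extension) is not None


pattern, lower_level, mappings = summarize_override(OVERRIDE_XML)
print(pattern, lower_level, mappings)
for ext in ("docx", "doc", "pdf"):
    print(ext, matches(pattern, ext))
```

In this local check the pattern `^docx$` accepts `docx` but rejects `doc`, so legacy .doc files would fall outside the custom rules unless the expression is widened.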

More details can be found in the Developer Guide. This does require some technical knowledge though.
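When several custom styles need to be mapped, writing the XML by hand becomes error-prone. As a rough sketch, the override can also be generated with Python's standard library. The function `build_override` and its parameters are hypothetical names chosen for this illustration, and the element names and their order are simply copied from the example in this post rather than taken from a schema.

```python
import xml.etree.ElementTree as ET


def build_override(style_levels, lower_bookmark_level=3, extension_regex="^docx$"):
    """Return an Override XML string shaped like the example in this post.

    style_levels maps an MS-Word style name to the bookmark level it should get.
    """
    override = ET.Element("Override")
    options = ET.SubElement(override, "ProcessingOptions")
    source_files = ET.SubElement(
        options,
        "SourceFiles",
        {"filter": "property:SourceFile.OpenOptions.FileExtension"},
    )
    source_file = ET.SubElement(
        source_files, "SourceFile", {"filterValue": "regex:" + extension_regex}
    )
    settings = ET.SubElement(source_file, "ConversionSettings")
    ET.SubElement(settings, "GenerateBookmarks").text = "Custom"
    specific = ET.SubElement(
        settings,
        "ConverterSpecificSettings",
        {"type": "ConverterSpecificSettings_WordProcessing"},
    )
    bookmark_options = ET.SubElement(specific, "BookmarkOptions")
    ET.SubElement(bookmark_options, "UseHeadingStyles").text = "True"
    ET.SubElement(bookmark_options, "LowerBookmarkLevel").text = str(lower_bookmark_level)
    mappings = ET.SubElement(bookmark_options, "BookmarkMappings")
    # One BookmarkMapping element per custom style.
    for style, level in style_levels.items():
        mapping = ET.SubElement(mappings, "BookmarkMapping")
        ET.SubElement(mapping, "Source").text = style
        ET.SubElement(mapping, "Level").text = str(level)
    # ET.indent is available from Python 3.9; skip pretty-printing otherwise.
    if hasattr(ET, "indent"):
        ET.indent(override, space="    ")
    return ET.tostring(override, encoding="unicode")


# "AnotherCustomStyle" is a made-up style name used only for this example.
print(build_override({"MyFakeHeadingStyle": 2, "AnotherCustomStyle": 3}))
```

Generating the XML this way guarantees well-formed output and correct escaping of style names that contain characters such as `&` or `<`. It does not check that the settings are ones the service accepts.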

If you get stuck, leave a comment below or contact our support desk; we love to help.

Author
Jonathan D. Rhyne Co-Founder and CEO

Jonathan joined Nutrient in 2014. As CEO, Jonathan defines the company’s vision and strategic goals, bolsters the team culture, and steers product direction. When he’s not working, he enjoys being a dad, photography, and soccer.
