At Muhimbi we are continuously improving our range of PDF Conversion products such as the PDF Converter for SharePoint and the Muhimbi PDF Converter Services. The way we prioritise new features is simple: we take the list of our customers’ feature requests, order it by the number of times each feature has been requested, and develop away. It really is that simple.
We recently delivered our #1 feature request, which is compatibility with SharePoint 2010 and Office 2010. Next up is native support for watermarking (more about that in a future post) and the ability for our customers to add custom converters that fit their exact needs. This post provides an example of how to create your own Custom Converter and add it to the Document Conversion Service. Note that you need version 3.5 or newer to make use of custom converters.
Update: As of version 6.1 it is also possible to use existing 3rd party conversion software in combination with Muhimbi’s range of PDF Conversion products.
The Muhimbi PDF Converter Services allows custom converters to be added with relative ease. This is useful for converting file types specific to your organisation or for file types that have not (yet) been implemented by Muhimbi in the main product.
The following example shows how to replace the existing MS Word converter, which relies on MS Word being present, with a simple third party converter that works well enough for simple documents. Some programming knowledge is required.
The steps are as follows:
-
Create a new Visual Studio project, select Class Library (C#, .NET 3.0) as the template and give the project an appropriate name, e.g. CustomConverters.
-
Add references to the following DLLs. They are located in the folder the Muhimbi Document Conversion Service has been installed in, usually c:\Program Files\Muhimbi….
• Muhimbi.dll
• Muhimbi.DocumentConverter.WebService.dll
• Muhimbi.DocumentConverter.WebService.Data.dll
• System.Runtime.Serialization (Add reference from the .NET tab, you can’t ‘Browse’ for this DLL)
-
Change your project’s Assembly name and default namespace to something sensible, e.g. Muhimbi.DocumentConverter.WebService.CustomConverters. This can be done by right-clicking on your project and selecting Properties. The Application tab allows these settings to be changed.
-
Delete the automatically generated Class1.cs file and add a new class named WordConverter in a file named WordConverter.cs. Make sure the class definition is public.
-
Make the WordConverter class inherit from Muhimbi.DocumentConverter.WebService.AbstractDocumentConverter and implement its abstract members (right-click on the base class name and select Implement Abstract Class).
-
Add the following 2 constructors and make sure they call the base constructors.
    public WordConverter() : base()
    {
    }

    public WordConverter(Stream sourceFile, OpenOptions openOptions,
                         ConversionSettings conversionSettings)
        : base(sourceFile, openOptions, conversionSettings)
    {
    }
-
Next up, we need to implement the RunDiagnostics method. This method is normally used to carry out an internal end-to-end conversion to verify that the converter and all prerequisites have been installed correctly. In this example we skip that check and simply return a new DiagnosticResultItem with the Valid property set to true.
    public override DiagnosticResultItem RunDiagnostics()
    {
        DiagnosticResultItem dri = new DiagnosticResultItem();
        dri.Valid = true;
        return dri;
    }
-
If we need to look further than just the file extension to determine the file type then we can optionally override the CanConvert method and look inside the stream (available in the _sourceFile member variable). This is not necessary for this sample converter, but an example is provided below.
    public override bool CanConvert(string[] fileExtensions)
    {
        // ** Do we know anything about this extension
        if (base.CanConvert(fileExtensions) == false)
            return false;

        // ** Investigate in more detail
        ...implement your own...
    }
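As an illustration of what "looking inside the stream" could mean, here is a minimal sketch that checks the first few bytes of a file against two well-known container signatures: the OLE2 compound file header used by binary .doc files and the ZIP header used by .docx files. The FileSignature class and the LooksLikeWordContainer method are hypothetical names invented for this example; they are not part of the Muhimbi API. The check is deliberately rough: other formats (.xls, .xlsx, plain ZIP archives) share the same signatures, so a match only shows that the container is plausible, not that the file is a Word document.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of the Muhimbi API.
public static class FileSignature
{
    // OLE2 compound file header, used by binary .doc files (and other legacy Office formats).
    private static readonly byte[] Ole2 = { 0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1 };

    // ZIP local file header, used by .docx files (and every other ZIP based format).
    private static readonly byte[] Zip = { 0x50, 0x4B, 0x03, 0x04 };

    public static bool LooksLikeWordContainer(Stream source)
    {
        // Without seeking we cannot peek and then rewind, so give up early.
        if (source == null || !source.CanSeek)
            return false;

        long originalPosition = source.Position;
        try
        {
            source.Position = 0;
            byte[] header = new byte[Ole2.Length];
            int read = 0;

            // Stream.Read may return fewer bytes than requested, so keep reading.
            while (read < header.Length)
            {
                int n = source.Read(header, read, header.Length - read);
                if (n == 0)
                    break;
                read += n;
            }

            return StartsWith(header, read, Ole2) || StartsWith(header, read, Zip);
        }
        finally
        {
            // Restore the position so the converter can still read the stream afterwards.
            source.Position = originalPosition;
        }
    }

    private static bool StartsWith(byte[] buffer, int length, byte[] signature)
    {
        if (length < signature.Length)
            return false;

        for (int i = 0; i < signature.Length; i++)
        {
            if (buffer[i] != signature[i])
                return false;
        }
        return true;
    }
}
```

Inside CanConvert, after the call to base.CanConvert, the override could then return FileSignature.LooksLikeWordContainer(_sourceFile). This assumes the stream handed to the converter supports seeking.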
-
The next and final method to implement is the actual Convert method, which is where all the magic happens. As it is not feasible to develop an MS-Word to PDF converter from scratch, the sensible thing to do is to use a third party library such as SyncFusion DocIO or Aspose.Words (download the archive that contains just the DLLs). In this example we are going to use Aspose’s library for processing MS-Word files. It is not perfect, but for some documents such as forms and simple text documents it works very well.
Copy Aspose.Words.dll into the project directory and add a reference to it. Copy the following code into the WordConverter class.
    public override Stream Convert()
    {
        try
        {
            // Validate as certain options are not supported by this converter
            if (_openOptions.AllowMacros != MacroSecurityOption.None)
                Logger.Warn("Macros are not supported by this converter.");

            // Set the licences for Aspose.Words.
            Aspose.Words.License wordLicence = new Aspose.Words.License();
            //wordLicence.SetLicense("Enter your license in here.");

            Document asposeDocument = new Document(_sourceFile, null, LoadFormat.Auto,
                                                   _openOptions.Password);

            // Do we need to refresh the fields etc?
            if (_openOptions.RefreshContent == true)
                asposeDocument.Range.UpdateFields();

            // Convert the Document to PDF and save it as a memory stream.
            if (_conversionSettings.Format == OutputFormat.PDF)
            {
                MemoryStream convertedStream = new MemoryStream();
                PdfOptions options = new PdfOptions();

                // Specify the PDF Profile
                if (_conversionSettings.PDFProfile == PDFProfile.PDF_1_5)
                    options.Compliance = PdfCompliance.Pdf15;
                else
                    options.Compliance = PdfCompliance.PdfA1b;

                // How to deal with bookmarks
                if (_conversionSettings.GenerateBookmarks == BookmarkGenerationOption.Automatic)
                    options.HeadingsOutlineLevels = 9;
                else if (_conversionSettings.GenerateBookmarks == BookmarkGenerationOption.Custom)
                    options.BookmarksOutlineLevel = 9;

                // Correct the start and end pages if needed
                int startPage = _conversionSettings.StartPage != 0 ?
                                _conversionSettings.StartPage - 1 : 0;
                int pageCount = asposeDocument.PageCount - startPage;
                if (_conversionSettings.EndPage != 0)
                    pageCount = Math.Min(_conversionSettings.EndPage - startPage, pageCount);

                // Carry out the actual conversion
                asposeDocument.SaveToPdf(startPage, pageCount, convertedStream, options);
                return convertedStream;
            }
            else
            {
                throw new NotSupportedException("Outputformat '" + _conversionSettings.Format +
                                                "' not supported by this Converter.");
            }
        }
        catch (UnsupportedFileFormatException ex)
        {
            throw new WebServiceInternalException(
                WebServiceExceptionType.FileFormatNotSupported, ex.Message);
        }
        catch (Exception ex)
        {
            string message = "An error occurred while converting a file";
            if (_openOptions != null && _openOptions.OriginalFileName != null)
                message += " - " + _openOptions.OriginalFileName;
            Logger.Error(message, ex);
            throw new WebServiceInternalException(WebServiceExceptionType.InternalError, message);
        }
    }
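The start and end page arithmetic in the Convert method is easy to get wrong by one, so it may help to see the same calculation in isolation. The sketch below copies the logic into a small pure function. PageRange and Compute are hypothetical names chosen for this example and are not part of the Muhimbi or Aspose APIs. As in the code above, StartPage and EndPage are taken to be 1-based, with 0 meaning "not specified".

```csharp
using System;

// Hypothetical helper that mirrors the page range logic of the Convert method.
public static class PageRange
{
    // startPage / endPage: 1-based page numbers as supplied by the caller, 0 = not specified.
    // firstIndex: zero-based index of the first page to render.
    // pageCount: number of pages to render, starting at firstIndex.
    public static void Compute(int startPage, int endPage, int totalPages,
                               out int firstIndex, out int pageCount)
    {
        // Convert the 1-based start page to a zero-based index.
        firstIndex = startPage != 0 ? startPage - 1 : 0;

        // By default render everything from the first page to the end of the document.
        pageCount = totalPages - firstIndex;

        // If an end page was given, stop there (but never run past the document).
        if (endPage != 0)
            pageCount = Math.Min(endPage - firstIndex, pageCount);
    }
}

// Worked examples for a 10 page document:
//   Compute(0, 0, 10, ...)   gives firstIndex = 0, pageCount = 10  (whole document)
//   Compute(2, 4, 10, ...)   gives firstIndex = 1, pageCount = 3   (pages 2, 3 and 4)
//   Compute(3, 0, 10, ...)   gives firstIndex = 2, pageCount = 8   (page 3 to the end)
//   Compute(1, 50, 10, ...)  gives firstIndex = 0, pageCount = 10  (end page clamped)
```

Note that neither this sketch nor the original code validates its input: a start page beyond the end of the document, or an end page before the start page, produces a zero or negative page count. A production converter would probably want to reject such requests before calling the conversion library.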
-
Compile the project and copy the output DLL as well as Aspose.Words.dll to the directory that holds the Muhimbi Document Conversion Service.
-
Edit the service’s config file and make the following changes:
• If the file extensions for the new converter are currently handled by a different converter then remove these extensions from the existing converter.
• Add the definition for the new converter to the config file as per the following example.
    <add key="CustomWordConverter"
         description="Custom MS-Word Converter"
         fidelity="Full"
         supportedExtensions="doc,docx"
         type="Muhimbi.DocumentConverter.WebService.CustomConverters.WordConverter, Muhimbi.DocumentConverter.WebService.CustomConverters, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null" />
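The type attribute is a standard .NET assembly-qualified type name, and a typo in it is an easy way to end up with a converter that never loads. Rather than typing it by hand, you can ask .NET for the exact string. The sketch below is one way to do that; ConfigHelper and TypeAttributeFor are hypothetical names made up for this example.

```csharp
using System;

// Hypothetical helper, not part of the Muhimbi API.
public static class ConfigHelper
{
    // Returns the string .NET uses to locate a type: the full type name, followed by
    // the assembly name, version, culture and public key token.
    public static string TypeAttributeFor(Type converterType)
    {
        if (converterType == null)
            throw new ArgumentNullException("converterType");

        return converterType.AssemblyQualifiedName;
    }
}
```

Calling ConfigHelper.TypeAttributeFor(typeof(WordConverter)) from a scratch console project that references your converter DLL should return a string of the same shape as the type attribute in the example above. The version and public key token will reflect your own build settings, so copy the output rather than the values shown in this post.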
-
Restart the service to activate the changes.
    net stop "Muhimbi Document Converter Service"
    net start "Muhimbi Document Converter Service"
-
Finally, test that everything is working correctly, using either of the following:
• SharePoint: Open Central Administration / Application Management / Muhimbi Document Converter Settings, verify that the new converter is added to the list, check the tick box and click Validate Settings. If everything is working correctly then don’t forget to save the changes using the OK button.
• Winforms Diagnostics Tool: Launch the Diagnostics Tool from the Windows Start Menu, navigate to the WS Diagnose Tab and click the Request Diagnostics button. Verify the new converter is listed and Valid = True.
Any errors are logged to the Windows Application Event Log.
Congratulations, you have created your first custom converter. The source code for the WordConverter.cs class can be downloaded here.
Exception handling
Although you can let exceptions bubble up, we recommend catching any exceptions, looking at the root cause of the problem and then throwing a specific WebServiceInternalException using one of the following exception types.
    public enum WebServiceExceptionType
    {
        /// <summary>
        /// Unknown error
        /// </summary>
        Unknown,

        /// <summary>
        /// File format not supported
        /// </summary>
        FileFormatNotSupported,

        /// <summary>
        /// File corrupt
        /// </summary>
        CorruptDocument,

        /// <summary>
        /// An error occurred while opening the file
        /// </summary>
        ErrorOpeningFile,

        /// <summary>
        /// Conversion process timeout
        /// </summary>
        ConversionTimeOut,

        /// <summary>
        /// Application hang. Can happen when document is password protected
        /// </summary>
        ConverterNotResponding,

        /// <summary>
        /// The underlying converter has not been installed or not correctly installed.
        /// </summary>
        ConverterNotInstalled,

        /// <summary>
        /// Internal Validation (Should only happen during development)
        /// </summary>
        InternalError
    }
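To make the recommendation above more concrete, here is a rough sketch of how a converter might translate common .NET exceptions into one of these exception types. The ExceptionClassifier class, its Classify method and the particular mapping are assumptions made for this example; the right mapping depends on which exceptions your third party library actually throws. The enum is repeated here without comments only so that the sketch compiles on its own. In a real converter project, delete that declaration and use the enum from the Muhimbi DLLs.

```csharp
using System;
using System.IO;

// Stand-in copy so this sketch compiles without the Muhimbi DLLs; remove it in a real project.
public enum WebServiceExceptionType
{
    Unknown,
    FileFormatNotSupported,
    CorruptDocument,
    ErrorOpeningFile,
    ConversionTimeOut,
    ConverterNotResponding,
    ConverterNotInstalled,
    InternalError
}

// Hypothetical helper, not part of the Muhimbi API.
public static class ExceptionClassifier
{
    // Maps a caught exception to the closest matching exception type.
    // The mapping is illustrative only; adjust it to your conversion library.
    public static WebServiceExceptionType Classify(Exception ex)
    {
        if (ex is NotSupportedException)
            return WebServiceExceptionType.FileFormatNotSupported;

        // InvalidDataException signals malformed content, e.g. a damaged archive.
        if (ex is InvalidDataException)
            return WebServiceExceptionType.CorruptDocument;

        if (ex is TimeoutException)
            return WebServiceExceptionType.ConversionTimeOut;

        // Covers FileNotFoundException and the other IOException subclasses.
        if (ex is IOException || ex is UnauthorizedAccessException)
            return WebServiceExceptionType.ErrorOpeningFile;

        // Bad arguments usually point at a programming error in the converter itself.
        if (ex is ArgumentException)
            return WebServiceExceptionType.InternalError;

        return WebServiceExceptionType.Unknown;
    }
}
```

In the general catch block of the Convert method, the hard-coded WebServiceExceptionType.InternalError could then be replaced by ExceptionClassifier.Classify(ex) when constructing the WebServiceInternalException.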
Clavin is a Microsoft Business Applications MVP who supports 1,000+ high-level enterprise customers with challenges related to PDF conversion in combination with SharePoint on-premises, Office 365, Azure, Nintex, K2, and Power Platform, mostly with no-code solutions.