Converting Office files to PDF Format using a Web Services based interface
One of the key changes introduced with the release of the Muhimbi PDF Converter Services and API 3.0 is the ability to convert typical Office files via a web services based interface. This makes it very simple to convert typical Office files to PDF format from your own .NET, Java or any other web services capable environment.
This post describes the key features of the web services based interface and provides a simple example describing how to convert a document to PDF format. Source code for a more comprehensive demo is available for download as well. Feel free to contact us if you have any questions.
Prerequisites
Let’s make sure all prerequisites are in place before we start our tutorial.
- Download the PDF Converter Services.
- Install it in line with chapter 2 of the included Administration Guide.
Key Features
Key Features of the Muhimbi Server Platform are:
- Convert popular document types to PDF or XPS format with near perfect fidelity. At the time of writing, support is available for MS-Word, PowerPoint, Excel, InfoPath, Visio and MS-Publisher; additional document formats may have been added since.
- Scalable architecture that allows multiple conversions to run in parallel.
- Runs as a Windows Service. No need to install or configure IIS or other web service frameworks.
- Convert password protected documents.
- Apply security settings to generated PDF files, including encryption, password protection and multiple levels of PDF Security options to prevent users from printing documents or copying a document’s content.
- Generate a regular PDF file or a file in PDF/A format.
- Generate high resolution PDF Files optimised for printing or normal resolution files optimised for use on screen.
- Dynamically refresh a document’s content before generating the PDF. Ideal for merging content from SharePoint custom columns into your PDF file.
- Control how to deal with hidden / selected content such as PowerPoint Slides and Excel worksheets.
In addition to the features described above, the MDCS software stack also contains a layer of functionality to control concurrency, request queuing and watchdog services to deal with unresponsive and runaway processes. More detail can be found in the brochure.
Object Model
Although the Object Model exposed by the web service is easy to understand, the system provides very powerful functionality and fine grained control to specify how the PDF file is generated.
As outlined in the image below, the web service contains 3 methods:
- Convert: Convert the file in the sourceFile byte array using the specified openOptions and conversionSettings. The generated PDF or XPS file is returned as a byte array as well.
- GetConfiguration: Retrieve information about which converters are supported and the associated file extensions. Consider calling this method once to retrieve a list of valid file extensions, then check whether a file is supported before it is submitted to the web service. This prevents a lot of redundant traffic and increases scalability.
- GetDiagnostics: Run a diagnostics test that carries out an internal end-to-end test for each supported document type. Call this method to check if the service and all prerequisites have been deployed correctly.
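The advice to check a file's extension before submitting it can be sketched as follows. This is a minimal illustration, not part of the product: the SupportedExtensionFilter class is hypothetical, and it assumes the list of extensions has already been retrieved once (for example via GetConfiguration) and is handed in as plain strings.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper, not part of the Muhimbi API.
// Caches the supported extensions so each file can be checked locally
// instead of being sent to the web service and rejected there.
public sealed class SupportedExtensionFilter
{
    private readonly HashSet<string> _extensions =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase);

    // 'extensions' is assumed to come from a single GetConfiguration call.
    public SupportedExtensionFilter(IEnumerable<string> extensions)
    {
        foreach (string extension in extensions)
        {
            // Store without the leading dot so "docx" and ".docx" both match.
            _extensions.Add(extension.Trim().TrimStart('.'));
        }
    }

    // Returns true when the file name has an extension in the cached list.
    public bool IsSupported(string fileName)
    {
        string extension = Path.GetExtension(fileName);
        if (string.IsNullOrEmpty(extension))
        {
            return false;
        }
        return _extensions.Contains(extension.TrimStart('.'));
    }
}
```

A caller would build the filter once at start-up and call IsSupported for each candidate file, only invoking Convert when it returns true. The comparison ignores case, so report.DOCX matches an entry of docx.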
The full object model is available in the following diagram.
Simple example code
The following sample shows the minimum steps required to convert a document to PDF format. In our example we are using Visual Studio and C#, but any environment that can invoke web services should be able to access the required functionality. Note that the WSDL can be found at https://localhost:41734/Muhimbi.DocumentConverter.WebService/?wsdl. A Java based example is installed alongside the product and discussed in the User & Developer Guide.
This example does not explicitly set ConversionSettings.Format. As a result the file is converted to the default PDF format. It is possible to convert files to other file formats as well by setting this property to a value of the OutputFormat enumeration. For details see this blog post.
- Start a new Visual Studio project and use the project type of your choice. In this example we are using a standard .NET 3.0 project of type Windows Forms Application. Name it ‘Simple PDF Converter Sample’.
- Add a TextBox and a Button control to the form. Accept the default names of textBox1 and button1.
- In the Solution Explorer window, right-click References and select Add Service Reference.
- In the Address box enter the WSDL address listed in the introduction of this section. If the MDCS is located on a different machine then substitute localhost with the server’s name.
- Accept the default Namespace of ServiceReference1 and click the OK button to generate the proxy classes.
- Double-click button1 and replace the content of the entire code file with the following:
using System;
using System.IO;
using System.ServiceModel;
using System.Windows.Forms;
using Simple_PDF_Converter_Sample.ServiceReference1;

namespace Simple_PDF_Converter_Sample
{
    public partial class Form1 : Form
    {
        // ** The URL where the Web Service is located. Amend host name if needed.
        string SERVICE_URL = "https://localhost:41734/Muhimbi.DocumentConverter.WebService/";

        public Form1()
        {
            InitializeComponent();
        }

        private void button1_Click(object sender, EventArgs e)
        {
            DocumentConverterServiceClient client = null;
            try
            {
                // ** Determine the source file and read it into a byte array.
                string sourceFileName = textBox1.Text;
                byte[] sourceFile = File.ReadAllBytes(sourceFileName);

                // ** Open the service and configure the bindings
                client = OpenService(SERVICE_URL);

                // ** Set the absolute minimum open options
                OpenOptions openOptions = new OpenOptions();
                openOptions.OriginalFileName = Path.GetFileName(sourceFileName);
                openOptions.FileExtension = Path.GetExtension(sourceFileName);

                // ** Set the absolute minimum conversion settings.
                ConversionSettings conversionSettings = new ConversionSettings();
                conversionSettings.Fidelity = ConversionFidelities.Full;
                conversionSettings.Quality = ConversionQuality.OptimizeForPrint;

                // ** Carry out the conversion.
                byte[] convFile = client.Convert(sourceFile, openOptions, conversionSettings);

                // ** Write the converted file back to the file system with a PDF extension.
                string destinationFileName = Path.GetDirectoryName(sourceFileName) + @"\" +
                                             Path.GetFileNameWithoutExtension(sourceFileName) +
                                             "." + conversionSettings.Format;
                using (FileStream fs = File.Create(destinationFileName))
                {
                    // ** The using block closes the stream, no explicit Close() is needed.
                    fs.Write(convFile, 0, convFile.Length);
                }

                MessageBox.Show("File converted to " + destinationFileName);
            }
            catch (FaultException<WebServiceFaultException> ex)
            {
                MessageBox.Show("FaultException occurred: ExceptionType: " +
                                ex.Detail.ExceptionType.ToString());
            }
            catch (Exception ex)
            {
                MessageBox.Show(ex.ToString());
            }
            finally
            {
                CloseService(client);
            }
        }

        /// <summary>
        /// Configure the Bindings, endpoints and open the service using the specified address.
        /// </summary>
        /// <returns>An instance of the Web Service.</returns>
        public static DocumentConverterServiceClient OpenService(string address)
        {
            DocumentConverterServiceClient client = null;
            try
            {
                Uri serviceUri = new Uri(address);
                BasicHttpBinding binding = new BasicHttpBinding();

                // ** Use standard Windows Security. TransportCredentialOnly is only valid
                // ** for http:// addresses, so switch to Transport when the address is https://.
                binding.Security.Mode = serviceUri.Scheme == Uri.UriSchemeHttps
                    ? BasicHttpSecurityMode.Transport
                    : BasicHttpSecurityMode.TransportCredentialOnly;
                binding.Security.Transport.ClientCredentialType = HttpClientCredentialType.Windows;

                // ** Increase the Timeout to deal with (very) long running requests.
                binding.SendTimeout = TimeSpan.FromMinutes(30);
                binding.ReceiveTimeout = TimeSpan.FromMinutes(30);

                // ** Set the maximum document size to 50MB
                binding.MaxReceivedMessageSize = 50 * 1024 * 1024;
                binding.ReaderQuotas.MaxArrayLength = 50 * 1024 * 1024;
                binding.ReaderQuotas.MaxStringContentLength = 50 * 1024 * 1024;

                // ** Specify an identity (any identity) in order to get it past .net3.5 sp1
                EndpointIdentity epi = EndpointIdentity.CreateUpnIdentity("unknown");
                EndpointAddress epa = new EndpointAddress(serviceUri, epi);

                client = new DocumentConverterServiceClient(binding, epa);
                client.Open();
                return client;
            }
            catch (Exception)
            {
                CloseService(client);
                throw;
            }
        }

        /// <summary>
        /// Check if the client is open and then close it.
        /// </summary>
        /// <param name="client">The client to close</param>
        public static void CloseService(DocumentConverterServiceClient client)
        {
            if (client != null && client.State == CommunicationState.Opened)
                client.Close();
        }
    }
}
Provided the project and all controls are named as per the steps above, it should compile without errors. Run it, enter the full path to the source file, e.g. an MS-Word document, and click the button to start the conversion process. The conversion may take a few seconds depending on the complexity of the document.
Note that in this example we are programmatically configuring the WCF Bindings and End Points. If you wish, you can use a declarative approach using the config file instead.
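As a rough sketch of the declarative approach, the fragment below mirrors the settings used in the sample code (Windows credentials, 30 minute timeouts, 50 MB limits) in an app.config file. The binding and endpoint names are hypothetical, and the contract name assumes the proxy was generated into the default ServiceReference1 namespace; check the app.config that Add Service Reference generates for the actual value.

```xml
<configuration>
  <system.serviceModel>
    <bindings>
      <basicHttpBinding>
        <!-- "ConverterBinding" is a hypothetical name. 52428800 bytes = 50 MB. -->
        <binding name="ConverterBinding"
                 sendTimeout="00:30:00"
                 receiveTimeout="00:30:00"
                 maxReceivedMessageSize="52428800">
          <readerQuotas maxArrayLength="52428800"
                        maxStringContentLength="52428800" />
          <!-- "Transport" matches an https:// address. The sample code uses
               TransportCredentialOnly, which applies to http:// addresses. -->
          <security mode="Transport">
            <transport clientCredentialType="Windows" />
          </security>
        </binding>
      </basicHttpBinding>
    </bindings>
    <client>
      <!-- The contract value is an assumption based on the default namespace. -->
      <endpoint name="ConverterEndpoint"
                address="https://localhost:41734/Muhimbi.DocumentConverter.WebService/"
                binding="basicHttpBinding"
                bindingConfiguration="ConverterBinding"
                contract="ServiceReference1.DocumentConverterService">
        <identity>
          <userPrincipalName value="unknown" />
        </identity>
      </endpoint>
    </client>
  </system.serviceModel>
</configuration>
```

With a fragment like this in place, the generated client can typically be created by passing the endpoint name to its constructor rather than building the binding in code, which keeps host names and limits editable without recompiling.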
Download the source code including a compiled binary.
Complex sample code
In order to carry out internal testing we have developed an application that can be used to control each and every function exposed by the web services. The full source code as well as a compiled binary can be downloaded below.
Note that although the test harness works well and can be used to batch convert a large number of documents, this is not commercial grade code. Use at your own risk.
Download the source code including a compiled binary.
Final notes
If you wish to access the PDF Converter from your own custom SharePoint code, you may want to consider using our high level Wrapper methods. If you are not using the wrapper methods, then please make sure you are invoking the web service as a user who has the privileges to do so. By wrapping the code in SPSecurity.RunWithElevatedPrivileges you will automatically connect using an account in the WSS_WPG Windows group, which has access by default.
Clavin is a Microsoft Business Applications MVP who supports 1,000+ high-level enterprise customers with challenges related to PDF conversion in combination with SharePoint on-premises, Office 365, Azure, Nintex, K2, and Power Platform, mostly with no-code solutions.