How to Convert Office to PDF (Word, Excel & PPT) using Java
As we have been receiving an increasing number of requests for Java-based sample code for the Muhimbi PDF Converter Services, we have decided to lift the relevant chapter from the Developer Guide and publish it in this blog post. A .NET version of this post is available here.
For those not familiar with the product, the MDCS is a server-based SDK that allows software developers to convert typical Office files, including MS-Word, Excel, PowerPoint, Visio, Publisher and InfoPath, to PDF format using a robust, scalable but friendly Web Services interface from Java and .NET-based solutions.
Even though the MDCS itself must run on a Windows-based server, it has been designed to interoperate with non-Windows platforms such as Java. This section describes how to convert documents to PDF format using a Java-based environment.
The full version of the sample code discussed in this post, including pre-generated proxies, is installed alongside each copy of the MDCS.
The example described below assumes the following:
- The JDK has been installed and configured.
- The MDCS and all prerequisites have been installed in line with the Administration Guide.
- The MDCS is running in the default anonymous mode. This is not an absolute requirement, but it makes initial experimentation much easier.
The first step is to generate proxy classes for the web service by executing the following command:
wsimport https://localhost:41734/Muhimbi.DocumentConverter.WebService/?wsdl -d src -Xnocompile -p com.muhimbi.ws
Feel free to change the package name and destination directory to something more suitable for your organisation.
If the Muhimbi Conversion Service is not located on the same system on which wsimport is executed, change localhost to the name of the server running the Conversion Service. You will also need to change the host name in the Conversion Service’s config file. A convenient shortcut to the Installation folder is located in the Muhimbi Start Menu. Open Muhimbi.DocumentConverter.Service.exe.config, search for baseAddress and change the host name. Restart the Muhimbi Document Converter Service to activate the change.
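When the Conversion Service runs on another machine, the client code needs the same host name change as the wsimport command. The following is a minimal sketch of one way to avoid hard-coding localhost: it builds the WSDL address from a host name using the port and path shown in this post. The class name WsdlLocation and the mdcs.host system property are hypothetical names chosen for illustration; they are not part of the product.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

// Hypothetical helper: builds the WSDL address for a given Conversion Service host.
final class WsdlLocation {

    // Port and path as used in the wsimport command in this post.
    static final int PORT = 41734;
    static final String PATH = "/Muhimbi.DocumentConverter.WebService/?wsdl";

    static URL forHost(String host) {
        try {
            return URI.create("https://" + host + ":" + PORT + PATH).toURL();
        } catch (MalformedURLException e) {
            // Surface a bad host name as an unchecked exception.
            throw new IllegalArgumentException("Invalid host name: " + host, e);
        }
    }

    public static void main(String[] args) {
        // "mdcs.host" is an assumed property name; it falls back to localhost.
        String host = System.getProperty("mdcs.host", "localhost");
        System.out.println(forHost(host));
    }
}
```

Running it with -Dmdcs.host=yourserver prints the address to pass to wsimport or to the service constructor. The host name must still match the baseAddress configured on the server.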
wsimport automatically generates the Java class names. Unfortunately, some of the generated names are rather long and ugly, so you may want to consider renaming some, particularly the Exception classes, to something friendlier. Bear in mind, however, that if you ever run wsimport again you will need to re-apply those changes. For more information, have a look at the high-level overview of the Object Model exposed by the web service.
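An alternative to renaming generated classes is to leave them untouched and translate their exceptions into a short-named, hand-written exception at the point where the proxy is called, so nothing is lost when wsimport is run again. The sketch below illustrates the idea only. GeneratedConvertFaultMessage and generatedConvert are stand-ins for a generated fault class and a proxy call, and ConversionException is a hypothetical name; none of them are part of the generated code.

```java
// Stand-in for a long-named, wsimport-generated fault class (hypothetical).
class GeneratedConvertFaultMessage extends Exception {
    GeneratedConvertFaultMessage(String message) {
        super(message);
    }
}

// Hand-written exception with a friendlier name, kept outside the generated package.
class ConversionException extends Exception {
    ConversionException(String operation, Exception cause) {
        super(operation + " failed: " + cause.getMessage(), cause);
    }
}

final class FaultAdapterDemo {

    // Stand-in for a call on the generated proxy.
    static byte[] generatedConvert(boolean fail) throws GeneratedConvertFaultMessage {
        if (fail) {
            throw new GeneratedConvertFaultMessage("unsupported file");
        }
        return new byte[] {0x25, 0x50, 0x44, 0x46}; // the bytes of "%PDF"
    }

    // Thin adapter: callers only ever see ConversionException.
    static byte[] convert(boolean fail) throws ConversionException {
        try {
            return generatedConvert(fail);
        } catch (GeneratedConvertFaultMessage e) {
            throw new ConversionException("convert", e);
        }
    }

    public static void main(String[] args) {
        try {
            convert(true);
        } catch (ConversionException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The original fault remains available through getCause(), so details such as the fault info are not lost by wrapping.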
Once the proxy classes have been created, add the following sample code to your project. Run the code and make sure the path to the document to convert is specified on the command line. (Download Source Code)
This example sets ConversionSettings.Format to OutputFormat.PDF. As a result the file is converted to the default PDF format. It is possible to convert files to other file formats as well by setting this property to a different value. For details see this blog post.
package com.muhimbi.app;

import com.muhimbi.ws.*;

import java.io.*;
import java.net.URL;
import java.util.List;
import javax.xml.bind.JAXBElement;
import javax.xml.namespace.QName;

public class WsClient {

    private final static String DOCUMENTCONVERTERSERVICE_WSDL_LOCATION =
            "https://localhost:41734/Muhimbi.DocumentConverter.WebService/?wsdl";

    public static void main(String[] args) {
        try {
            if (args.length != 1) {
                System.out.println("Please specify a single file name on the command line.");
            } else {
                // ** Process command line parameters
                String sourceDocumentPath = args[0];
                File file = new File(sourceDocumentPath);
                String fileName = getFileName(file);
                String fileExt = getFileExtension(file);

                System.out.println("Converting file " + sourceDocumentPath);

                // ** Initialise Web Service
                // ** (the namespace must match the targetNamespace declared in the WSDL)
                DocumentConverterService_Service dcss = new DocumentConverterService_Service(
                        new URL(DOCUMENTCONVERTERSERVICE_WSDL_LOCATION),
                        new QName("https://tempuri.org/", "DocumentConverterService"));
                DocumentConverterService dcs = dcss.getBasicHttpBindingDocumentConverterService();

                // ** Only call conversion if file extension is supported
                if (isFileExtensionSupported(fileExt, dcs)) {
                    // ** Read source file from disk
                    byte[] fileContent = readFile(sourceDocumentPath);

                    // ** Converting the file
                    OpenOptions openOptions = getOpenOptions(fileName, fileExt);
                    ConversionSettings conversionSettings = getConversionSettings();
                    byte[] convertedFile = dcs.convert(fileContent, openOptions, conversionSettings);

                    // ** Writing converted file to file system
                    String destinationDocumentPath = getPDFDocumentPath(file);
                    writeFile(convertedFile, destinationDocumentPath);
                    System.out.println("File converted successfully to " + destinationDocumentPath);
                } else {
                    System.out.println("The file extension is not supported.");
                }
            }
        } catch (IOException e) {
            System.out.println(e.getMessage());
        } catch (DocumentConverterServiceGetConfigurationWebServiceFaultExceptionFaultFaultMessage e) {
            printException(e.getFaultInfo());
        } catch (DocumentConverterServiceConvertWebServiceFaultExceptionFaultFaultMessage e) {
            printException(e.getFaultInfo());
        }
    }

    // ** Describe the source document to the converter
    public static OpenOptions getOpenOptions(String fileName, String fileExtension) {
        ObjectFactory objectFactory = new ObjectFactory();
        OpenOptions openOptions = new OpenOptions();
        openOptions.setOriginalFileName(objectFactory.createOpenOptionsOriginalFileName(fileName));
        openOptions.setFileExtension(objectFactory.createOpenOptionsFileExtension(fileExtension));
        return openOptions;
    }

    // ** Convert the entire document to PDF at print quality
    public static ConversionSettings getConversionSettings() {
        ConversionSettings conversionSettings = new ConversionSettings();
        conversionSettings.setQuality(ConversionQuality.OPTIMIZE_FOR_PRINT);
        conversionSettings.setRange(ConversionRange.ALL_DOCUMENTS);
        conversionSettings.getFidelity().add("Full");
        conversionSettings.setFormat(OutputFormat.PDF);
        return conversionSettings;
    }

    // ** File name without extension (the whole name if there is no '.')
    public static String getFileName(File file) {
        String fileName = file.getName();
        int dot = fileName.lastIndexOf('.');
        return dot < 0 ? fileName : fileName.substring(0, dot);
    }

    // ** File extension without the '.' (empty if there is no '.')
    public static String getFileExtension(File file) {
        String fileName = file.getName();
        int dot = fileName.lastIndexOf('.');
        return dot < 0 ? "" : fileName.substring(dot + 1);
    }

    // ** Same folder and name as the source file, with the PDF extension
    public static String getPDFDocumentPath(File file) {
        String fileName = getFileName(file);
        String folder = file.getParent();
        if (folder == null) {
            folder = new File(file.getAbsolutePath()).getParent();
        }
        return folder + File.separatorChar + fileName + '.' + OutputFormat.PDF.value();
    }

    public static byte[] readFile(String filepath) throws IOException {
        File file = new File(filepath);
        byte[] bytes = new byte[(int) file.length()];
        // ** try-with-resources closes the stream even if reading fails
        try (InputStream is = new FileInputStream(file)) {
            int offset = 0;
            int numRead;
            while (offset < bytes.length
                    && (numRead = is.read(bytes, offset, bytes.length - offset)) >= 0) {
                offset += numRead;
            }
            if (offset < bytes.length) {
                throw new IOException("Could not completely read file " + file.getName());
            }
        }
        return bytes;
    }

    public static void writeFile(byte[] fileContent, String filepath) throws IOException {
        try (OutputStream os = new FileOutputStream(filepath)) {
            os.write(fileContent);
        }
    }

    // ** Ask the service which extensions its installed converters support
    public static boolean isFileExtensionSupported(String extension, DocumentConverterService dcs)
            throws DocumentConverterServiceGetConfigurationWebServiceFaultExceptionFaultFaultMessage {
        Configuration configuration = dcs.getConfiguration();
        final JAXBElement<ArrayOfConverterConfiguration> converters = configuration.getConverters();
        final ArrayOfConverterConfiguration ofConverterConfiguration = converters.getValue();
        final List<ConverterConfiguration> cList = ofConverterConfiguration.getConverterConfiguration();

        for (ConverterConfiguration cc : cList) {
            final List<String> supportedExtension = cc.getSupportedFileExtensions().getValue().getString();
            if (supportedExtension.contains(extension)) {
                return true;
            }
        }
        return false;
    }

    // ** Print the exception type and details returned by the web service
    public static void printException(WebServiceFaultException serviceFaultException) {
        System.out.println(serviceFaultException.getExceptionType());
        JAXBElement<ArrayOfstring> element = serviceFaultException.getExceptionDetails();
        ArrayOfstring value = element.getValue();
        for (String msg : value.getString()) {
            System.out.println(msg);
        }
    }
}
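On Java 7 and later, the file handling in the sample can also be written with java.nio.file. The following is a rough sketch of equivalents for reading the source file, writing the converted file and deriving the destination path. NioFileHelpers is a hypothetical class name, and the lower-case "pdf" extension is an assumption; the sample above takes the extension from OutputFormat.PDF.value() instead.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper class; not part of the MDCS sample code.
final class NioFileHelpers {

    // Same folder and base name as the source file, with a "pdf" extension (assumed).
    static Path pdfPathFor(Path source) {
        String name = source.getFileName().toString();
        int dot = name.lastIndexOf('.');
        String base = dot > 0 ? name.substring(0, dot) : name;
        return source.toAbsolutePath().resolveSibling(base + ".pdf");
    }

    // Reads the whole file into memory, as the sample's readFile does.
    static byte[] read(Path path) {
        try {
            return Files.readAllBytes(path);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes the bytes, creating the file or replacing its content.
    static void write(Path path, byte[] content) {
        try {
            Files.write(path, content);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path source = Paths.get(args.length > 0 ? args[0] : "report.docx");
        System.out.println(source + " -> " + pdfPathFor(source));
    }
}
```

Files.readAllBytes removes the need for the manual read loop, but like the original it loads the entire document into memory, which may matter for very large files.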
Clavin is a Microsoft Business Applications MVP who supports 1,000+ high-level enterprise customers with challenges related to PDF conversion in combination with SharePoint on-premises, Office 365, Azure, Nintex, K2, and Power Platform, mostly with no-code solutions.