Programmatically Converting Web / HTML Pages to PDF Format
As part of our ongoing series about new features in the PDF Converter for SharePoint On-Premises 4.0 and PDF Converter Services, we would like to showcase our exciting new HTML to PDF conversion functionality.
Please note that this article mentions SharePoint as well as .NET a number of times. Rest assured that, as the PDF Converter Services is Web Services based, it works just as well from Java, C#, Ruby and other web services capable environments.
We anticipate that most of our customers will use this functionality to convert SharePoint pages, including lists, to PDF format. However, rather than displaying a boring old SharePoint site, let’s show how well this works with a real website, in this case one of our landing pages.
UPDATE: A workflow activity for converting HTML to PDF is now available as well, as is an update to the SharePoint user interface for converting SharePoint pages to PDF format.
The following image shows the original HTML page on the left-hand side and the converted PDF file on the right. As you can see, the layout, images and text of the page carry over to the PDF very well.
Example of the original web page (left) and the converted PDF file (right)
A summary of the new HTML features is as follows. Although this new functionality is available in both the PDF Converter Services and the PDF Converter for SharePoint On-Premises, some of the more SharePoint-centric features in the list are exclusive to the SharePoint version.
- Built on top of Muhimbi’s rock solid service platform. No need to worry about runaway or orphaned processes. Everything is nicely controlled and scales over multiple CPUs and conversion servers.
- Integrates with the full feature set of Muhimbi’s PDF Conversion platform including full control over watermarks as well as PDF Security settings.
- High fidelity conversion (See image above) including multi page documents and JavaScript output. The generated PDF file contains real (searchable) text and is not just a low resolution screenshot of the converted web page.
- Supports conversion by URL as well as manually specified HTML fragments. Ideal for creating PDF based reports using generated HTML tables.
- Convert HTML documents stored inside SharePoint document libraries.
- Convert SharePoint pages to PDF format from the user’s Personal Actions menu.
- Convert web pages to PDF format from SharePoint workflows. Works great in combination with publishing sites.
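To make the "generated HTML tables" idea from the list above a little more concrete, here is a minimal sketch of how a report fragment could be built in C#. The `HtmlReportBuilder` class and its method names are hypothetical helpers made up for this illustration; they are not part of the Muhimbi API. The sketch only produces the UTF-8 byte array, which is the same kind of input the C# sample in this article passes to the converter as `sourceFile`.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Text;

// Hypothetical helper for illustration only; not part of the Muhimbi API.
public static class HtmlReportBuilder
{
    // Builds a simple HTML table fragment from a header row and data rows.
    public static string BuildTable(IReadOnlyList<string> headers,
                                    IEnumerable<IReadOnlyList<string>> rows)
    {
        var html = new StringBuilder();
        html.Append("<table border=\"1\"><tr>");
        foreach (string header in headers)
        {
            // HtmlEncode escapes characters such as <, > and & so that
            // data values cannot break the generated markup.
            html.Append("<th>").Append(WebUtility.HtmlEncode(header)).Append("</th>");
        }
        html.Append("</tr>");

        foreach (IReadOnlyList<string> row in rows)
        {
            html.Append("<tr>");
            foreach (string cell in row)
            {
                html.Append("<td>").Append(WebUtility.HtmlEncode(cell)).Append("</td>");
            }
            html.Append("</tr>");
        }

        html.Append("</table>");
        return html.ToString();
    }

    // Encodes the fragment as UTF-8 bytes, ready to be used as the
    // source of an HTML fragment conversion.
    public static byte[] ToSourceFile(string htmlFragment)
    {
        return Encoding.UTF8.GetBytes(htmlFragment);
    }
}
```

A caller would typically build the fragment from a query result, for example `HtmlReportBuilder.ToSourceFile(HtmlReportBuilder.BuildTable(headers, rows))`, and hand the resulting bytes to the conversion call. Encoding every cell keeps user-supplied values from being interpreted as markup.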
HTML to PDF Conversion is accessible via the web services based interface as well. Listed below is a simple C# example of how to carry out a conversion from your own code. The code is not complete as it calls into some shared functions from our main C# example to keep things short.
Our existing Java based examples can easily be extended to carry out the same type of conversions. Contact us if you need a hand; we love to help and are very responsive.
    /// <summary>
    /// Simple sample to convert either a URL or HTML code fragment to PDF format
    /// </summary>
    /// <param name="htmlOnly">A flag indicating if an HTML Code fragment (true)
    /// or URL (false) should be converted.</param>
    private void ConvertHTML(bool htmlOnly)
    {
        DocumentConverterServiceClient client = null;
        try
        {
            string sourceFileName = null;
            byte[] sourceFile = null;

            client = OpenService("https://localhost:41734/Muhimbi.DocumentConverter.WebService/");
            OpenOptions openOptions = new OpenOptions();

            //** Specify optional authentication settings for the web page
            openOptions.UserName = "";
            openOptions.Password = "";

            if (htmlOnly == true)
            {
                //** Specify the HTML to convert
                sourceFile = System.Text.Encoding.UTF8.GetBytes("Hello <b>world</b>");
            }
            else
            {
                // ** Specify the URL to convert
                openOptions.OriginalFileName = "https://www.muhimbi.com/";
            }
            openOptions.FileExtension = "html";

            //** Generate a temp file name that is later used to write the PDF to
            sourceFileName = Path.GetTempFileName();
            File.Delete(sourceFileName);

            // ** Enable JavaScript on the page to convert.
            openOptions.AllowMacros = MacroSecurityOption.All;

            // ** Set the various conversion settings
            ConversionSettings conversionSettings = new ConversionSettings();
            conversionSettings.Fidelity = ConversionFidelities.Full;
            conversionSettings.PDFProfile = PDFProfile.PDF_1_5;
            conversionSettings.PageOrientation = PageOrientation.Portrait;
            conversionSettings.Quality = ConversionQuality.OptimizeForPrint;

            // ** Carry out the actual conversion
            byte[] convertedFile = client.Convert(sourceFile, openOptions, conversionSettings);

            // ** Write the PDF file to the local file system.
            string destinationFileName = Path.GetDirectoryName(sourceFileName) + @"\" +
                                         Path.GetFileNameWithoutExtension(sourceFileName) +
                                         "." + conversionSettings.Format;
            using (FileStream fs = File.Create(destinationFileName))
            {
                fs.Write(convertedFile, 0, convertedFile.Length);
                fs.Close();
            }

            // ** Display the converted file in a PDF viewer.
            NavigateBrowser(destinationFileName);
        }
        finally
        {
            CloseService(client);
        }
    }
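The sample above creates a temp file just to obtain a unique name, deletes it again, and then assembles the destination path by hand. As a possible alternative, the sketch below derives a random file name without touching the disk and writes the converted bytes in one call. `TempPdfFile` is a hypothetical helper introduced only for this illustration, and the `pdf` extension is an assumption that matches the default output of the sample.

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration only; not part of the Muhimbi API.
public static class TempPdfFile
{
    // Returns a random path in the temp folder with the given extension.
    // GetRandomFileName only generates a name; it does not create a file.
    public static string NewPath(string extension = "pdf")
    {
        string name = Path.GetRandomFileName();
        return Path.Combine(Path.GetTempPath(), Path.ChangeExtension(name, extension));
    }

    // Writes the converted document to a new temp file and returns its path.
    public static string Write(byte[] convertedFile)
    {
        string path = NewPath();
        File.WriteAllBytes(path, convertedFile);
        return path;
    }
}
```

With a helper like this, the last part of the sample could shrink to something along the lines of `NavigateBrowser(TempPdfFile.Write(convertedFile));`. Using `Path.Combine` also avoids hard-coding the backslash separator.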
All in all some pretty exciting functionality. Don’t hesitate to leave a comment below if you have any questions or contact us to discuss any of our products.
Clavin is a Microsoft Business Applications MVP who supports 1,000+ high-level enterprise customers with challenges related to PDF conversion in combination with SharePoint on-premises, Office 365, Azure, Nintex, K2, and Power Platform, mostly through no-code solutions.