How to Convert HTML to PDF in Java

2024-12-02

There are many reasons why PDF is the most commonly used document format in the world. For one, its level of compatibility is unmatched — PDFs can be viewed with perfect fidelity on PC, Mac, Linux, web browsers, and mobile platforms with no problems whatsoever. Add to this its print quality and immutability, and you have a clear go-to choice when it comes to convenience.

Turning now to the matter of conversion, however, you start to run into some problems. There is no clear and simple means by which you can directly create a PDF from HTML code with Java. Instead, a whole process of parsing and rendering must first be performed, which is about as much fun as it sounds. So how can we achieve the high quality results that we require without wasting a ton of development hours on the problem?

Today, we will be looking at how to accomplish this quickly and easily through the use of an API. After just a few simple setup steps, we will be able to perform a variety of useful operations for moving between HTML and PDF:

HTML document to PDF.
HTML string to PDF.
URL to PDF.
Editing PDFs.

One particularly important goal for these operations will be to maintain a high level of accuracy when making the transition between the two formats. Advanced design elements including CSS, JavaScript, and images will all be preserved post-conversion. One detail to bear in mind: images should be referenced by absolute URLs or embedded inline as Base64.
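To make the inline Base64 option concrete, here is a minimal sketch that turns raw image bytes into a data URI using only the Java standard library. The class and method names (`InlineImage`, `toDataUri`, `toImgTag`) are hypothetical and not part of the API discussed in this article.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Sketch: embed image bytes in HTML as a Base64 data URI (hypothetical helper). */
final class InlineImage {

    /** Builds a data URI of the form "data:<mimeType>;base64,<encoded bytes>". */
    static String toDataUri(byte[] imageBytes, String mimeType) {
        String encoded = Base64.getEncoder().encodeToString(imageBytes);
        return "data:" + mimeType + ";base64," + encoded;
    }

    /** Wraps the data URI in an img tag. altText is assumed to be HTML-safe already. */
    static String toImgTag(byte[] imageBytes, String mimeType, String altText) {
        return "<img src=\"" + toDataUri(imageBytes, mimeType) + "\" alt=\"" + altText + "\">";
    }

    public static void main(String[] args) {
        // Placeholder bytes stand in for a real image file read from disk.
        byte[] placeholder = "hello".getBytes(StandardCharsets.US_ASCII);
        System.out.println(toImgTag(placeholder, "image/png", "logo"));
    }
}
```

In practice the bytes would come from something like `Files.readAllBytes` on the image file, and the resulting tag replaces a relative `src` path that the converter would not be able to resolve.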

Without further ado, let's dive straight in.

We begin with our library installation, which first requires a repository reference in our Maven POM file:

<repositories>
    <repository>
        <id>jitpack.io</id>
        <url>https://jitpack.io</url>
    </repository>
</repositories>

That will allow JitPack to dynamically compile our library. Second, we need to add the dependency reference:

<dependencies>
    <dependency>
        <groupId>com.github.Cloudmersive</groupId>
        <artifactId>Cloudmersive.APIClient.Java</artifactId>
        <version>v3.54</version>
    </dependency>
</dependencies>

Next, let us turn our attention to our controller. We will first need our imports to be added to the top of the file.

// Import classes:
import java.io.File;

import com.cloudmersive.client.invoker.ApiClient;
import com.cloudmersive.client.invoker.ApiException;
import com.cloudmersive.client.invoker.Configuration;
import com.cloudmersive.client.invoker.auth.*;
import com.cloudmersive.client.ConvertDocumentApi;

And now we can call our function, so let’s have a look at this example code below:

ApiClient defaultClient = Configuration.getDefaultApiClient();

// Configure API key authorization: Apikey
ApiKeyAuth Apikey = (ApiKeyAuth) defaultClient.getAuthentication("Apikey");
Apikey.setApiKey("YOUR API KEY");
// Uncomment the following line to set a prefix for the API key, e.g. "Token" (defaults to null)
//Apikey.setApiKeyPrefix("Token");

ConvertDocumentApi apiInstance = new ConvertDocumentApi();
File inputFile = new File("/path/to/file"); // File | Input file to perform the operation on.
try {
    byte[] result = apiInstance.convertDocumentHtmlToPdf(inputFile);
    // result holds the raw PDF bytes; report the size rather than printing the array reference
    System.out.println("Received " + result.length + " bytes of PDF data");
} catch (ApiException e) {
    System.err.println("Exception when calling ConvertDocumentApi#convertDocumentHtmlToPdf");
    e.printStackTrace();
}

To make this work, we will need to ensure the following things:

Provide a valid HTML document as an inputFile.
Call the function convertDocumentHtmlToPdf using our API instance.
Set the API key, which is available from the Cloudmersive website for free (forever), allowing up to 1,000 calls across all available APIs.
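The call above hands back the PDF as a byte array, so the usual next step is to write it to disk. The following is a minimal sketch using only the standard library; `PdfSaver`, `looksLikePdf`, and `save` are hypothetical names, and the signature check is an optional sanity test rather than anything the API requires.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Sketch: persist the byte[] returned by a conversion call (hypothetical helper). */
final class PdfSaver {

    /** True when the data begins with the "%PDF-" signature that PDF files start with. */
    static boolean looksLikePdf(byte[] data) {
        byte[] signature = "%PDF-".getBytes(StandardCharsets.US_ASCII);
        if (data == null || data.length < signature.length) {
            return false;
        }
        for (int i = 0; i < signature.length; i++) {
            if (data[i] != signature[i]) {
                return false;
            }
        }
        return true;
    }

    /** Writes the bytes to target, creating or overwriting the file, and returns the path. */
    static Path save(byte[] data, Path target) throws IOException {
        if (!looksLikePdf(data)) {
            throw new IllegalArgumentException("Data does not start with a PDF signature");
        }
        return Files.write(target, data);
    }
}
```

With this in place, the success branch of the example could call something like `PdfSaver.save(result, Paths.get("output.pdf"))`, where the output path is whatever suits your application.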

And just like that, you are already set up. Note that the above function is designed to work with HTML documents. So, what if we have an HTML string instead? The process is essentially the same, but we will be calling a different function, which is part of the ConvertWebApi. This means we will need to change/add to our imports to reflect this:

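// Import classes:
import com.cloudmersive.client.invoker.ApiClient;
import com.cloudmersive.client.invoker.ApiException;
import com.cloudmersive.client.invoker.Configuration;
import com.cloudmersive.client.invoker.auth.*;
import com.cloudmersive.client.ConvertWebApi;
// Request model used below; the package name is assumed from the client's generated layout
import com.cloudmersive.client.model.HtmlToPdfRequest;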

Now we can call convertWebHtmlToPdf:

ApiClient defaultClient = Configuration.getDefaultApiClient();

// Configure API key authorization: Apikey
ApiKeyAuth Apikey = (ApiKeyAuth) defaultClient.getAuthentication("Apikey");
Apikey.setApiKey("YOUR API KEY");
// Uncomment the following line to set a prefix for the API key, e.g. "Token" (defaults to null)
//Apikey.setApiKeyPrefix("Token");

ConvertWebApi apiInstance = new ConvertWebApi();
HtmlToPdfRequest input = new HtmlToPdfRequest(); // HtmlToPdfRequest | HTML to PDF request parameters
// Supply the HTML string to convert. The setter name setHtml is assumed from the
// generated client model; check the client documentation if it differs.
input.setHtml("<html><body><h1>Hello, PDF</h1></body></html>");
try {
    byte[] result = apiInstance.convertWebHtmlToPdf(input);
    // result holds the raw PDF bytes; report the size rather than printing the array reference
    System.out.println("Received " + result.length + " bytes of PDF data");
} catch (ApiException e) {
    System.err.println("Exception when calling ConvertWebApi#convertWebHtmlToPdf");
    e.printStackTrace();
}

The key difference here is that instead of inputting an HTML file, we will add the HTML string as part of our HtmlToPdfRequest object. Everything else is the same as before and just as simple to get off the ground. Within this API, there are also related functions that allow you to convert the input HTML into a PNG image, a DOCX document, or a plain text string.
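HTML strings passed to a converter are often assembled at runtime from application data, so it helps to escape that data before it lands in the markup. Below is a minimal sketch of one way to build such a string with the standard library alone; `HtmlSnippet`, `escape`, and `page` are hypothetical names, and the page layout is purely illustrative.

```java
/** Sketch: assemble a small HTML document from untrusted text (hypothetical helper). */
final class HtmlSnippet {

    /** Escapes the characters that would otherwise be read as markup. */
    static String escape(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (char c : text.toCharArray()) {
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"': sb.append("&quot;"); break;
                default:  sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Builds a complete page with inline CSS, a heading, and one paragraph. */
    static String page(String title, String body) {
        return "<!DOCTYPE html><html><head><meta charset=\"utf-8\">"
             + "<title>" + escape(title) + "</title>"
             + "<style>body{font-family:sans-serif}</style></head>"
             + "<body><h1>" + escape(title) + "</h1>"
             + "<p>" + escape(body) + "</p></body></html>";
    }
}
```

The string returned by `page` is the kind of value you would place in the request object before calling the conversion function.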

Let us move on to look at creating PDFs from websites directly, using URLs. We will be using ConvertWebApi again, so make sure it is on your list of imports. The function we need is called convertWebUrlToPdf:

ApiClient defaultClient = Configuration.getDefaultApiClient();

// Configure API key authorization: Apikey
ApiKeyAuth Apikey = (ApiKeyAuth) defaultClient.getAuthentication("Apikey");
Apikey.setApiKey("YOUR API KEY");
// Uncomment the following line to set a prefix for the API key, e.g. "Token" (defaults to null)
//Apikey.setApiKeyPrefix("Token");

ConvertWebApi apiInstance = new ConvertWebApi();
UrlToPdfRequest input = new UrlToPdfRequest(); // UrlToPdfRequest | URL to PDF request parameters
// Supply the page to convert. The setter name setUrl is assumed from the
// generated client model; check the client documentation if it differs.
input.setUrl("https://example.com");
try {
    byte[] result = apiInstance.convertWebUrlToPdf(input);
    // result holds the raw PDF bytes; report the size rather than printing the array reference
    System.out.println("Received " + result.length + " bytes of PDF data");
} catch (ApiException e) {
    System.err.println("Exception when calling ConvertWebApi#convertWebUrlToPdf");
    e.printStackTrace();
}

Similar to the previous function, we create a request object, then pass it our desired URL and some optional parameters, such as the scale factor for the output. Pretty simple. There are also related functions that allow you to create a PNG screenshot or a text string from a URL.
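Since each conversion call counts against your quota, it can be worth rejecting obviously unusable URLs before sending them. The sketch below is one possible pre-check using `java.net.URI`; the class name `UrlCheck` and the rule it applies (absolute http or https URL with a host) are assumptions of this example, not requirements stated by the API.

```java
import java.net.URI;
import java.net.URISyntaxException;

/** Sketch: cheap local validation of a URL before requesting a conversion (hypothetical helper). */
final class UrlCheck {

    /** Accepts only absolute http/https URLs that include a host name. */
    static boolean isConvertibleUrl(String candidate) {
        if (candidate == null) {
            return false;
        }
        try {
            URI uri = new URI(candidate.trim());
            String scheme = uri.getScheme();
            boolean isWeb = "http".equalsIgnoreCase(scheme) || "https".equalsIgnoreCase(scheme);
            return isWeb && uri.getHost() != null;
        } catch (URISyntaxException e) {
            // Malformed input, for example a string containing spaces
            return false;
        }
    }
}
```

A caller would run this check first and only build the request object when it returns true.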

So what else can this API do with PDFs? If you would like to add some security, you can encrypt your PDF file with a password using this function below:

// Import classes (place these at the top of the file):
import java.io.File;

import com.cloudmersive.client.invoker.ApiClient;
import com.cloudmersive.client.invoker.ApiException;
import com.cloudmersive.client.invoker.Configuration;
import com.cloudmersive.client.invoker.auth.*;
import com.cloudmersive.client.EditPdfApi;

ApiClient defaultClient = Configuration.getDefaultApiClient();

// Configure API key authorization: Apikey
ApiKeyAuth Apikey = (ApiKeyAuth) defaultClient.getAuthentication("Apikey");
Apikey.setApiKey("YOUR API KEY");
// Uncomment the following line to set a prefix for the API key, e.g. "Token" (defaults to null)
//Apikey.setApiKeyPrefix("Token");

EditPdfApi apiInstance = new EditPdfApi();
String ownerPassword = "ownerPassword_example"; // String | Password of an owner (creator/editor) of the PDF file (required)
String userPassword = "userPassword_example"; // String | Password of a user (reader) of the PDF file (optional)
File inputFile = new File("/path/to/inputfile"); // File | Input file to perform the operation on.
String encryptionKeyLength = "256"; // String | Possible values are "128" (128-bit RC4 encryption) and "256" (256-bit AES encryption).  Default is 256.
Boolean allowPrinting = true; // Boolean | Set to false to disable printing through DRM.  Default is true.
Boolean allowDocumentAssembly = true; // Boolean | Set to false to disable document assembly through DRM.  Default is true.
Boolean allowContentExtraction = true; // Boolean | Set to false to disable copying/extracting content out of the PDF through DRM.  Default is true.
Boolean allowFormFilling = true; // Boolean | Set to false to disable filling out form fields in the PDF through DRM.  Default is true.
Boolean allowEditing = true; // Boolean | Set to false to disable editing in the PDF through DRM (making the PDF read-only).  Default is true.
Boolean allowAnnotations = true; // Boolean | Set to false to disable annotations and editing of annotations in the PDF through DRM.  Default is true.
Boolean allowDegradedPrinting = true; // Boolean | Set to false to disable degraded printing of the PDF through DRM.  Default is true.
try {
    byte[] result = apiInstance.editPdfSetPermissions(ownerPassword, userPassword, inputFile, encryptionKeyLength, allowPrinting, allowDocumentAssembly, allowContentExtraction, allowFormFilling, allowEditing, allowAnnotations, allowDegradedPrinting);
    // result holds the protected PDF as raw bytes; report the size rather than printing the array reference
    System.out.println("Received " + result.length + " bytes of PDF data");
} catch (ApiException e) {
    System.err.println("Exception when calling EditPdfApi#editPdfSetPermissions");
    e.printStackTrace();
}

Notice that with the various parameters, you can achieve a high level of control over the various permissions, such as printing, editing, and content extraction. You can also set the length of the encryption key and the password itself. The reverse operation is also available through editPdfDecrypt, allowing you to remove password protection and unlock your PDF files. Within this API, there also exist functions to get and set PDF metadata, transfer pages between PDF documents, and edit annotations as well as form fields.
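Because the encryption key length is passed as a string, a typo only surfaces once the API call is made. A small local check can catch it earlier. This is a minimal sketch based on the two values named in the parameter comment above ("128" and "256", defaulting to 256); `EncryptionOptions` and `normalizeKeyLength` are hypothetical names.

```java
/** Sketch: validate the encryptionKeyLength argument before calling the API (hypothetical helper). */
final class EncryptionOptions {

    /** Returns "128" or "256"; blank or null input falls back to the documented default of "256". */
    static String normalizeKeyLength(String requested) {
        if (requested == null || requested.trim().isEmpty()) {
            return "256";
        }
        String value = requested.trim();
        if (value.equals("128") || value.equals("256")) {
            return value;
        }
        throw new IllegalArgumentException(
            "encryptionKeyLength must be \"128\" or \"256\", got: " + requested);
    }
}
```

The returned value can then be passed directly as the `encryptionKeyLength` argument in the example above.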