How to Change PDF Paper Sizes With an API in Java
Almost every business around the world works with PDF documents daily in some capacity, so even narrow, specialized PDF workflows are often worth automating. The purpose of this article is to demonstrate an efficient web API solution Java developers can use to quickly adjust PDFs between common ISO 216 A-Series paper sizes (A0 to A7).
Before we get to our demonstration, however, we’ll first take a moment to understand the ISO 216 standard, and we’ll briefly review how PDF file structure handles page sizing to make programmatic adjustments possible.
ISO 216 PDF Paper Size Definition
PDF is the standard digital publishing format for content originally created in dozens of other applications, and it’s also frequently used to format and physically print workplace IDs, advertising pamphlets, and many other materials.
There’s a science to ensuring PDF content corresponds with standard physical printing materials, and that science is laid out by the International Organization for Standardization (ISO). ISO 216 defines a family of paper sizes that share a common aspect ratio of 1 to the square root of 2 (roughly 1:1.414). Because cutting a sheet in half along its long side produces a sheet with the same proportions, documents can be scaled from one size to the next without altering the layout, and, most importantly, the standard sizes help ensure compatibility between different devices (e.g., printers and copiers) across the world. ISO 216 itself covers the A-Series and B-Series; the related C-Series, used mainly for envelopes, is defined separately in ISO 269. The A-Series is by far the most widely used of the three. It continues down to A10, but this article focuses on A0 (largest) through A7 (the smallest size covered here). A4 (210 x 297 mm) is the default page size we’ll find in use for most PDF documents.
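To make the halving relationship concrete, here is a minimal sketch that derives the A0 to A7 dimensions in millimeters. It assumes the commonly published A0 size of 841 x 1189 mm and rounds down to whole millimeters at each step; the class and method names are illustrative only.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Derives A-Series sheet sizes (in millimeters) by repeatedly halving A0. */
class ASeriesSizes {

    /** Returns {width, height} in mm for A0 through A7, keyed by size name. */
    static Map<String, int[]> sizesInMm() {
        Map<String, int[]> sizes = new LinkedHashMap<>();
        int width = 841;   // A0 short side in mm
        int height = 1189; // A0 long side in mm
        for (int n = 0; n <= 7; n++) {
            sizes.put("A" + n, new int[] {width, height});
            // Next size: halve the long side (rounded down);
            // the old short side becomes the new long side.
            int nextWidth = height / 2;
            height = width;
            width = nextWidth;
        }
        return sizes;
    }

    public static void main(String[] args) {
        sizesInMm().forEach((name, wh) ->
                System.out.println(name + ": " + wh[0] + " x " + wh[1] + " mm"));
    }
}
```

Running `main` lists each size in turn, including `A4: 210 x 297 mm` and `A5: 148 x 210 mm`.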
PDF File Structure: Defining Page Size
While it’s easier for us humans to think of paper sizes as named steps between A0 and A7, computers don’t need to see it that way. As far as our PDF documents are concerned, the size of each page in a PDF is simply specified in the `MediaBox` entry of the page object dictionary. This “box” is an array of four numbers that defines the boundaries of the physical medium, rather than the digital medium, on which any given page is intended to be displayed or printed. The first two numbers give the coordinates of the lower-left corner and the last two give the coordinates of the upper-right corner, all expressed in points (each point = 1/72 of an inch). To put this in context, the A4 size (210 x 297 mm) mentioned earlier is typically written in the `MediaBox` as `[0 0 595 842]` because 210 mm is approximately 595 points and 297 mm is approximately 842 points.
So, when we make programmatic paper size changes to PDF documents, we need to navigate the PDF file structure to the `MediaBox` array, and from there, we need to determine the target ISO size by converting the A-Series millimeter definitions into the points-based coordinates the document is prepared to understand.
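The millimeter-to-point conversion described above can be sketched in a few lines. This is a simplified illustration, not how any particular library does it: the class and method names are hypothetical, and values are rounded to whole points, whereas real tools may keep fractional values.

```java
import java.util.Arrays;

/** Converts millimeter page dimensions into a MediaBox-style array of points. */
class MediaBoxCalculator {
    static final double POINTS_PER_INCH = 72.0;
    static final double MM_PER_INCH = 25.4;

    /** Converts millimeters to points, rounded to the nearest whole point. */
    static long mmToPoints(double mm) {
        return Math.round(mm / MM_PER_INCH * POINTS_PER_INCH);
    }

    /** Builds {llx, lly, urx, ury} with the lower-left corner at the origin. */
    static long[] mediaBox(double widthMm, double heightMm) {
        return new long[] {0, 0, mmToPoints(widthMm), mmToPoints(heightMm)};
    }

    public static void main(String[] args) {
        System.out.println("A4: " + Arrays.toString(mediaBox(210, 297)));
        System.out.println("A5: " + Arrays.toString(mediaBox(148, 210)));
    }
}
```

For A4 this yields `[0, 0, 595, 842]`, matching the values quoted earlier, and for A5 it yields `[0, 0, 420, 595]`.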
Open-Source Solution: Change Paper Size
Of course, as usual, we (thankfully) don’t have to write super complex programs from scratch to handle these steps. If we want to go the open-source route, we can use something like Apache PDFBox (a popular library for manipulating PDFs in a variety of ways, including changing PDF paper sizes) and leverage the `PDRectangle` class to interact with the `MediaBox` and thereby define the size of our PDF pages. `PDRectangle` lets us handle this whole operation with a fairly minimal amount of code; the only hangup we might encounter is how memory is managed in that process. PDFs we’re planning to use for physical printing tend to be large files, and we might find that processing such files at scale in local memory buffers burns up more of our resources than we’re willing to commit. This is where a web API can step in and offer some additional flexibility; it can both simplify the operation and reduce the local processing power required to get the job done.
Web API Solution: Change Paper Size
By using a web API to handle our PDF paper size adjustments, we can offload the bulk of our burdensome PDF file processing to an external cloud-hosted endpoint, and we can simply download the result of the operation when it’s finished. We can also avoid working with PDF-specific library classes directly, instead passing a small number of simple, readable inputs limited specifically to the paper-size operation.
In the below demonstration, we’ll walk through each step required to call a specialized web API that lets us adjust paper size between A0 and A7 with simple string inputs (e.g., entering “A5” changes all page sizes in the document to ISO standard 148 x 210 mm). This is a free solution, and it only requires a free API key to use in perpetuity (800 API calls per month).
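Since the paper size is passed as a plain string, it can be worth checking the value locally before spending an API call on it. The sketch below is a hypothetical pre-check (the helper is not part of the SDK) that accepts only A0 through A7. How the API itself treats lowercase or padded input is not covered here, so the helper normalizes the value first.

```java
import java.util.Locale;
import java.util.regex.Pattern;

/** Local pre-check for the paper size string before calling the API. */
class PaperSizeInput {
    // Accepts exactly "A0" through "A7".
    private static final Pattern A_SERIES = Pattern.compile("A[0-7]");

    /** Trims and upper-cases the input, then rejects anything outside A0 to A7. */
    static String normalize(String input) {
        if (input == null) {
            throw new IllegalArgumentException("Paper size is required");
        }
        String candidate = input.trim().toUpperCase(Locale.ROOT);
        if (!A_SERIES.matcher(candidate).matches()) {
            throw new IllegalArgumentException(
                    "Unsupported paper size: " + input + " (expected A0 to A7)");
        }
        return candidate;
    }

    public static void main(String[] args) {
        System.out.println(normalize(" a5 "));
    }
}
```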
Step 1: Install the Maven SDK
To begin structuring our API call, we’ll first need to add repository and dependency information to our `pom.xml` file (JitPack is used to dynamically compile the library).
Let’s add the following repository reference:
<repositories>
    <repository>
        <id>jitpack.io</id>
        <url>https://jitpack.io</url>
    </repository>
</repositories>
And then let’s add the following dependency reference:
<dependencies>
    <dependency>
        <groupId>com.github.Cloudmersive</groupId>
        <artifactId>Cloudmersive.APIClient.Java</artifactId>
        <version>v4.25</version>
    </dependency>
</dependencies>
Step 2: Add the Import Statements
We’ll now add the following imports:
// Import classes:
import java.io.File; // needed for the input file in Step 4
import com.cloudmersive.client.invoker.ApiClient;
import com.cloudmersive.client.invoker.ApiException;
import com.cloudmersive.client.invoker.Configuration;
import com.cloudmersive.client.invoker.auth.*;
import com.cloudmersive.client.EditPdfApi;
Step 3: Configure API Key Authorization
With the below snippet, we’ll set up the API client and configure our API key:
ApiClient defaultClient = Configuration.getDefaultApiClient();
// Configure API key authorization: Apikey
ApiKeyAuth Apikey = (ApiKeyAuth) defaultClient.getAuthentication("Apikey");
Apikey.setApiKey("YOUR API KEY");
// Uncomment the following line to set a prefix for the API key, e.g. "Token" (defaults to null)
//Apikey.setApiKeyPrefix("Token");
Step 4: Instantiate the API
In our final step, we’ll create an instance of the API, define our input file and desired paper size (remember, these are values A0 through A7), and call the API:
EditPdfApi apiInstance = new EditPdfApi();
File inputFile = new File("/path/to/inputfile"); // File | Input file to perform the operation on.
String paperSize = "A5"; // String | The desired paper size for the resized PDF document. Size ranges from A7 (smallest) to A0 (largest).
try {
    // The response is the resized PDF as raw bytes
    byte[] result = apiInstance.editPdfResize(inputFile, paperSize);
    // Print the size of the result; printing the array itself would only show an object reference
    System.out.println("Resized PDF received: " + result.length + " bytes");
} catch (ApiException e) {
    System.err.println("Exception when calling EditPdfApi#editPdfResize");
    e.printStackTrace();
}
The `try`-`catch` block catches any `ApiException` thrown by the call, so a failed request produces an informative message and a stack trace we can use to diagnose and resolve the issue instead of an unhandled crash.
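Because the call in Step 4 returns the resized document as a `byte[]`, we’ll usually want to write those bytes to disk rather than print them. The following is a minimal sketch using only the standard library; the `PdfResultWriter` class and its `save` method are hypothetical helpers, and the bytes in `main` are a stand-in for a real API response.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Writes the byte[] returned by the resize call to a file on disk. */
class PdfResultWriter {

    /** Saves the bytes to outputPath, replacing any existing file, and returns the path. */
    static Path save(byte[] pdfBytes, String outputPath) throws IOException {
        if (pdfBytes == null || pdfBytes.length == 0) {
            throw new IllegalArgumentException("No PDF data to write");
        }
        Path target = Paths.get(outputPath);
        // Files.write creates the file or truncates an existing one by default.
        return Files.write(target, pdfBytes);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in bytes for demonstration; in practice, pass the API result here.
        byte[] placeholder = "%PDF-1.7\n".getBytes(StandardCharsets.US_ASCII);
        Path tempFile = Files.createTempFile("resized-", ".pdf");
        Path written = save(placeholder, tempFile.toString());
        System.out.println("Wrote " + Files.size(written) + " bytes to " + written);
    }
}
```

In the Step 4 snippet, this would be called as `PdfResultWriter.save(result, "/path/to/output.pdf")` inside the `try` block, with an additional `catch` for `IOException`.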
Conclusion
In this article, we reviewed the relevance of ISO 216 paper size definitions, discussed how PDF file structure stores and represents paper size information independently of ISO A, B, and C Series definitions, and then looked at two solutions (one open-source and one independent web API) for programmatically adjusting PDF A-Series paper sizes.