Using Spring To Download a Zip File, Extract It, and Upload It to Cloud Storage Without Storing Files Locally in the Container
There are use cases in which, as part of integration work, you might need to download a zip file from one of your partners, extract its contents, and then move the extracted files to cloud storage. We had a similar need: downloading the uploaded ID (driver's license) images (front/back) from one of the leading ID verification service providers and persisting them in the organization's cloud storage. The challenge lies in downloading a zip file, extracting its contents, and uploading them to cloud storage, all in a transient manner without creating any temporary files on your microservice's container.
Downloading the Zipped File Content From Third-Party Service
Below is the code snippet to get the zipped file from the partner service.
RestTemplate restTemplate = new RestTemplate();
HttpHeaders headers = new HttpHeaders();
// Ask for the zipped payload; the response body is a streamed zip file
headers.set("Accept", "application/zip");
headers.set("Authorization", "Bearer " + accessToken);
HttpEntity<Void> request = new HttpEntity<>(headers);
String url = this.baseUrl + "/documents/" + documentUUID + "?imagequality=original";
ResponseEntity<Resource> response = restTemplate.exchange(url, HttpMethod.GET, request, Resource.class);
In this case, the URL is the path on the partner service from which the zipped images can be downloaded. This is typical RestTemplate code that is very common in the Spring ecosystem. There are two things to note here:

- Accept header: it is set to application/zip, since the response will be a streamed zip file. (A GET request carries no body, so it is the Accept header, rather than Content-Type, that tells the server which response format we expect.)
- Response return type: it is Resource, an interface Spring uses to represent external resources.
To get the resource object, simply read the response body, as in the snippet below.
Resource zipFileContent = response.getBody();
Extracting Zip File Content
Once the resource object is in hand, the real work of extracting and moving the extracted files begins. Below is the code snippet for extracting the zip file's contents.
HashMap<String, byte[]> map = new HashMap<>();
if (zipFileContent != null) {
    // try-with-resources guarantees the stream is closed even if a read fails
    try (ZipInputStream zipInputStream = new ZipInputStream(zipFileContent.getInputStream())) {
        ZipEntry zipEntry;
        byte[] buff = new byte[4096];
        while ((zipEntry = zipInputStream.getNextEntry()) != null) {
            // consume all the data from this entry; read() returns -1
            // once the current entry is exhausted
            ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
            int byteLength;
            while ((byteLength = zipInputStream.read(buff)) > 0) {
                byteArrayOutputStream.write(buff, 0, byteLength);
            }
            map.put(zipEntry.getName(), byteArrayOutputStream.toByteArray());
        }
    }
}
The Java class ZipInputStream is an input stream filter for reading individual files from a ZIP archive, and it provides a mechanism for iterating through the zip entries. As part of this iteration, we can read each individual file into a byte array using a ByteArrayOutputStream. This creates a transient, in-memory representation of each file in the zip archive. Each file's byte array is then stored in a HashMap for subsequent upload to cloud storage.
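One thing worth noting: the map holds every extracted file in memory at once. That is perfectly fine for a small archive such as a pair of ID images, but if the archives can grow large, you could instead upload each entry as soon as it has been read, so that only one file is buffered at a time. Below is a minimal sketch of that variant; uploadEntry is a hypothetical hook standing in for whatever upload call you use.

try (ZipInputStream zis = new ZipInputStream(zipFileContent.getInputStream())) {
    ZipEntry entry;
    while ((entry = zis.getNextEntry()) != null) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        zis.transferTo(out); // reads only the current entry (Java 9+)
        uploadEntry(entry.getName(), out.toByteArray()); // hypothetical upload hook
    }
}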
Now you can iterate through the map and upload each individual file to cloud storage.

for (Map.Entry<String, byte[]> entry : map.entrySet()) {
    ByteArrayResource byteArrayResource = this.getDocumentByteArray(entry.getValue(), entry.getKey());
    // post each extracted file to the internal upload API (uploadDocV1, shown below)
    this.uploadDocV1(accessToken, byteArrayResource);
}
private ByteArrayResource getDocumentByteArray(byte[] bytes, String fileName) {
    try {
        // Override getFilename() so the multipart upload carries the original
        // file name; a plain ByteArrayResource would return null here.
        return new ByteArrayResource(bytes) {
            @Override
            public String getFilename() {
                return fileName;
            }
        };
    } catch (Exception ex) {
        logger.error("Exception - getDocumentByteArray - Error while getting the uploaded images byte array content, detail error : ", ex);
    }
    return null;
}
The uploadDocV1 method called in the loop takes the Resource object (a ByteArrayResource) and posts it to your internal documents upload API, which in turn pushes it to your respective cloud storage, be it AWS or Azure.
public DocumentsResponse uploadDocV1(String accessToken, Resource file) {
    RestTemplate restTemplate = new RestTemplate();
    HttpHeaders headers = new HttpHeaders();
    headers.setContentType(MediaType.MULTIPART_FORM_DATA);
    headers.set("Authorization", "Bearer " + accessToken);
    // The resource's getFilename() supplies the file name for the "file" part
    MultiValueMap<String, Object> body = new LinkedMultiValueMap<>();
    body.add("file", file);
    HttpEntity<MultiValueMap<String, Object>> entity = new HttpEntity<>(body, headers);
    String apiEndPoint = this.baseUrl + "/api/documents/v1/upload";
    DocumentsResponse result = null;
    try {
        URI uri = new URI(apiEndPoint);
        result = restTemplate.postForObject(uri, entity, DocumentsResponse.class);
    } catch (Exception ex) {
        logger.error("An error occurred while uploading, detail error : ", ex);
    }
    return result;
}
"/api/documents/v1/upload" is an internal API (microservice) that is responsible for uploading documents to cloud storage.
Document Upload to Cloud Storage
The document upload API accepts the file as a MultipartFile and then puts it into an AWS S3 bucket.
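For context, the receiving endpoint might look something like the sketch below. The actual controller is not shown in this article, so the class name, mapping, and response wiring are assumptions.

@RestController
public class DocumentsController {
    // Illustrative only: path and part name mirror the upload call above
    @PostMapping(path = "/api/documents/v1/upload", consumes = MediaType.MULTIPART_FORM_DATA_VALUE)
    public DocumentsResponse upload(@RequestParam("file") MultipartFile file) throws Exception {
        String s3Key = uploadDocument("documents/" + file.getOriginalFilename(), file);
        return new DocumentsResponse(s3Key); // assumes a simple response wrapper
    }
}

The uploadDocument method it delegates to is shown below.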
private String uploadDocument(String s3BucketPath, MultipartFile multipartFile) throws Exception {
    try (InputStream documentStream = multipartFile.getInputStream()) {
        ObjectMetadata metadata = new ObjectMetadata();
        metadata.setContentType(multipartFile.getContentType());
        // Supplying the content length up front lets the SDK stream the upload
        // instead of buffering the whole file in memory to compute it.
        metadata.setContentLength(multipartFile.getSize());
        Map<String, String> attributes = new HashMap<>();
        attributes.put("document-content-size", String.valueOf(multipartFile.getSize()));
        metadata.setUserMetadata(attributes);
        this.awsS3Client.putObject(new PutObjectRequest(this.s3bucket,
                s3BucketPath, documentStream, metadata));
        logger.info("Saved successfully to S3 bucket with keyName={}", s3BucketPath);
        return s3BucketPath;
    } catch (AmazonS3Exception ex) {
        logger.warn("s3Bucket={}. Key={}", s3bucket, s3BucketPath);
        if (ex.getErrorCode().equalsIgnoreCase("NoSuchBucket")) {
            String msg = String.format("No bucket found with name %s", s3bucket);
            logger.error(msg, ex);
            throw new DocumentException(true, msg);
        } else if (ex.getErrorCode().equalsIgnoreCase("AccessDenied")) {
            String msg = String.format("Access denied to S3 bucket %s", s3bucket);
            logger.error(msg, ex);
            throw new DocumentException(true, msg);
        }
        logger.error(String.format("Error saving file %s to AWS S3 bucket %s", s3BucketPath, s3bucket), ex);
        throw ex;
    } catch (IOException ex) {
        logger.warn("s3Bucket={}. Key={}", s3bucket, s3BucketPath);
        logger.error(String.format("Error saving file %s to AWS S3 bucket %s", s3BucketPath, s3bucket), ex);
        throw ex;
    }
}
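The S3 code above assumes an AmazonS3 client (this.awsS3Client) has already been configured. A minimal sketch of such a bean using the AWS SDK for Java v1 could look like this; the region and the use of the default credentials provider chain are assumptions to adapt to your environment.

@Configuration
public class S3Config {
    // Minimal sketch: credentials are resolved via the default provider chain;
    // the region is an assumption for illustration.
    @Bean
    public AmazonS3 awsS3Client() {
        return AmazonS3ClientBuilder.standard()
                .withRegion(Regions.US_EAST_1)
                .build();
    }
}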
In conclusion, this article has demonstrated how to efficiently handle ZIP files in a Spring-based application without the need for temporary storage. We've walked through the process of downloading a ZIP file from a partner service, extracting the contents, and uploading the files to cloud storage, all without creating temporary files on the microservice container.
Remember, the code provided here is a base from which you can build and adapt to suit your needs. It is important to tailor this to your specific use case, ensuring that it is secure, efficient, and reliable.
For those looking to explore more about Spring or AWS, I would recommend visiting their official documentation. If you're interested in file handling in Java, the Java I/O streams tutorial could be a good starting point.
By understanding and implementing these techniques, you can streamline your data processing tasks and make your applications more efficient and resilient. Happy coding!