Spring Webflux Multipart File Upload and Reading Each Line Without Saving It

I’ve been working with Spring Webflux for a while, and in my experience, uploading and reading files in this framework is quite a hassle.

Today I am going to talk about uploading a file using Spring Webflux. The best part is that I am not going to save the file; I will read it on the fly. I will also check whether all the data in the file match my RegEx criteria using the powerful Java Stream API.

The Real-Life Problem I’ve Faced

The problem I faced was this: upload any type of file, with the condition that the lines of the file are separated by newlines. The file must not be saved on the server. Instead, build a List of String by reading the file, where each item of the list is a single line of the file. Every line must match a validation rule; otherwise, the whole file has to be discarded as corrupted. In short: upload -> read -> check -> list of strings from the file, without saving it.

So, the rough steps are:

1. Upload the file(s) through a POST endpoint.
2. Read the content without saving the file anywhere.
3. Check every line against a RegEx validation rule.
4. Return the lines as a list of strings, or nothing if the file is corrupted.

Sound scary? Well, I will explain it to you step by step. So, what are we waiting for? Let’s dig in.

Controller
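
A minimal controller along these lines might look like the sketch below. The class name and the FileUploadService wiring are placeholders of mine; the upload method signature, the upload-flux path, and the consumes setting are the parts this tutorial actually relies on.

Java

import org.springframework.http.MediaType;
import org.springframework.http.codec.multipart.FilePart;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestPart;
import org.springframework.web.bind.annotation.RestController;
import reactor.core.publisher.Flux;

@RestController
public class FileUploadController {

    // Hypothetical service bean; its getLines method is shown in the Service section
    private final FileUploadService fileUploadService;

    public FileUploadController(FileUploadService fileUploadService) {
        this.fileUploadService = fileUploadService;
    }

    // POST endpoint that consumes multipart/form-data; the "files" part is bound to Flux<FilePart>
    @PostMapping(value = "/upload-flux", consumes = MediaType.MULTIPART_FORM_DATA_VALUE)
    public Flux<String> upload(@RequestPart("files") Flux<FilePart> filePartFlux) {
        return fileUploadService.getLines(filePartFlux);
    }
}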


This part is easy. This is a POST endpoint that can accept multiple files. The URL part is upload-flux, and you must use consumes = MediaType.MULTIPART_FORM_DATA_VALUE. As we can see, I have used:

Java

public Flux<String> upload(@RequestPart("files") Flux<FilePart> filePartFlux)


Here, the files part of the request will be automatically injected as Flux<FilePart> into the method by Spring Webflux.

Remember:
1. To upload multiple files, you must use Flux<FilePart>.
2. To upload a single file, you can use Mono<FilePart> or FilePart.
3. Mono<MultiValueMap<String, Part>> can be used for both cases, but then you have to look up the FilePart(s) from the map by key, as in the sketch below. For this tutorial, the key is files for both single and multiple files.
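
For the third option, the lookup by key could look roughly like the sketch below. The /upload-map path, the fileUploadService field, and the method name are my own placeholders; only the files key and the Part types come from the cases above.

Java

// Hypothetical variant of the endpoint that accepts the whole multipart map
@PostMapping(value = "/upload-map", consumes = MediaType.MULTIPART_FORM_DATA_VALUE)
public Flux<String> uploadViaMap(@RequestBody Mono<MultiValueMap<String, Part>> partsMono) {
    Flux<FilePart> filePartFlux = partsMono
            .flatMapMany(parts -> Flux.fromIterable(parts.get("files"))) // find the part(s) by the "files" key
            .cast(FilePart.class);                                       // we expect file parts under that key
    return fileUploadService.getLines(filePartFlux);                     // hand off to the same service as below
}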

For this tutorial, I am going to use Flux<FilePart>.

Service

From the controller layer, filePartFlux is now passed to the service layer. I have divided the work of this service into two methods. Let’s try to understand these methods one by one.

First Method

Java

public Flux<String> getLines(Flux<FilePart> filePartFlux) {

    return filePartFlux.flatMap(filePart ->
            filePart.content().map(dataBuffer -> {
                byte[] bytes = new byte[dataBuffer.readableByteCount()];
                dataBuffer.read(bytes);
                DataBufferUtils.release(dataBuffer);

                return new String(bytes, StandardCharsets.UTF_8);
            }))
            .map(this::processAndGetLinesAsList)
            .flatMapIterable(Function.identity());
}



In this method, the filePartFlux is passed straight in from the controller layer. We then flatMap filePartFlux to get a new Flux<String> stream.

Java

filePartFlux.flatMap(filePart ->
    filePart.content().map(dataBuffer -> {
        byte[] bytes = new byte[dataBuffer.readableByteCount()];
        dataBuffer.read(bytes);
        DataBufferUtils.release(dataBuffer);

        return new String(bytes, StandardCharsets.UTF_8);
    }))


filePartFlux emits each FilePart into the flatMap. We then access the content of the FilePart and map it to create a Flux of String. Inside the map, we get the dataBuffer emitted from content(). Keep in mind that only a certain number of bytes are readable from this dataBuffer, so we create a byte array bytes with length dataBuffer.readableByteCount().

We fill the bytes array by reading data from the dataBuffer with dataBuffer.read(bytes), free the dataBuffer by releasing it with DataBufferUtils.release(dataBuffer), and finally convert the bytes into a String and return it. Once this whole process completes, we have a new Flux<String> stream. Now let's look at the rest of the method.

Java

.map(this::processAndGetLinesAsList)
.flatMapIterable(Function.identity());


Now we take every String from the Flux<String> stream and process it with the processAndGetLinesAsList method (described in the next section), which handles the processing and validation check and returns a List<String>. flatMapIterable(Function.identity()) then flattens those lists back into a single Flux<String>. If the validation check decides the file is corrupted, an empty list comes back from processAndGetLinesAsList, so an empty Flux<String> is returned from here.
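
To get a feel for what flatMapIterable(Function.identity()) does on its own, here is a tiny standalone demo (my own snippet, not part of the project) that flattens a Flux of lists into a Flux of their elements:

Java

import java.util.List;
import java.util.function.Function;
import reactor.core.publisher.Flux;

public class FlatMapIterableDemo {
    public static void main(String[] args) {
        // Two lists, standing in for the List<String> produced from each decoded chunk
        Flux<List<String>> lists = Flux.just(List.of("line 1", "line 2"), List.of("line 3"));

        // flatMapIterable(Function.identity()) emits each element of each list individually
        lists.flatMapIterable(Function.identity())
             .subscribe(System.out::println); // prints: line 1, line 2, line 3
    }
}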

Second Method

Java

private List<String> processAndGetLinesAsList(String string) {

    Supplier<Stream<String>> streamSupplier = string::lines;
    var isFileOk = streamSupplier.get().allMatch(line ->
                         line.matches(MultipartFileUploadUtils.REGEX_RULES));

    return isFileOk ? streamSupplier.get()
                           .filter(s -> !s.isBlank())
                           .collect(Collectors.toList())
                    : new ArrayList<>();
}



This is not as scary as it looks. Just read it as it is written.

In this method, we add some validation over our data. To do that, we first split each string into lines via string::lines.

At this point, you might ask why we are doing this. Well, we need every line from the file. But because the string value comes from FilePart and DataBuffer, there is no guarantee that each String emitted by the Flux stream corresponds to a single line of the file. The file is read part by part, one String is generated per part, and so each String may contain multiple lines of the file.

So, what we have done here is create a Supplier that supplies a Stream of String, built from the lines of the split string. We use a Supplier rather than a single Stream because a Java Stream can be traversed only once, and we need to go over the lines twice: once for validation and once for collecting (see the short demo below).
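
As a quick aside (my own snippet, not from the project), this is what the single-use restriction looks like and why the Supplier helps:

Java

Stream<String> stream = "a\nb\nc".lines();
stream.count();          // first traversal works fine
// stream.count();       // a second traversal here would throw IllegalStateException

Supplier<Stream<String>> streamSupplier = "a\nb\nc"::lines;
streamSupplier.get().count();   // every get() hands back a fresh Stream,
streamSupplier.get().count();   // so the lines can safely be traversed twice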

Back in processAndGetLinesAsList, the next statement is our validation checkpoint. Here we check every string of the stream (which eventually means every line of the uploaded file) against our RegEx rules using Java’s Stream API.

Java

streamSupplier.get().allMatch(line -> line.matches(Util.YOUR_REGEX))


In effect, this is equivalent to checking every line one by one and failing fast at the first mismatch.
allMatch returns true only if every value in the stream meets the condition. Mind-blowing, isn’t it?
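
Written as a plain loop (my own sketch for illustration; the code above sticks with allMatch), the check would look something like this:

Java

boolean isFileOk = true;
for (String line : string.split("\\R")) {                      // \R matches any line break
    if (!line.matches(MultipartFileUploadUtils.REGEX_RULES)) {
        isFileOk = false;                                      // one bad line is enough to reject the file
        break;                                                 // allMatch short-circuits in the same way
    }
}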

So, in our code, if all is well, the stream is converted into a list (with blank lines filtered out) and returned. Otherwise, at least one of the file’s values violates our rules, in other words the file is corrupted, and an empty list is returned instead.


And that’s it: the Flux of String is returned all the way back to the client. Enjoy your list of strings.
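
If you want to call the endpoint from code, a WebClient request along these lines should do it. The file paths, the base URL, and the exact /upload-flux path are my assumptions; the files part name and the multipart content type match the controller described above.

Java

import org.springframework.core.io.FileSystemResource;
import org.springframework.http.MediaType;
import org.springframework.http.client.MultipartBodyBuilder;
import org.springframework.web.reactive.function.BodyInserters;
import org.springframework.web.reactive.function.client.WebClient;
import reactor.core.publisher.Flux;

public class UploadClientDemo {
    public static void main(String[] args) {
        // Build a multipart body with one or more parts named "files", matching @RequestPart("files")
        MultipartBodyBuilder builder = new MultipartBodyBuilder();
        builder.part("files", new FileSystemResource("data/first.txt"));   // hypothetical file path
        builder.part("files", new FileSystemResource("data/second.txt"));  // hypothetical file path

        Flux<String> lines = WebClient.create("http://localhost:8080")     // hypothetical base URL
                .post()
                .uri("/upload-flux")
                .contentType(MediaType.MULTIPART_FORM_DATA)
                .body(BodyInserters.fromMultipartData(builder.build()))
                .retrieve()
                .bodyToFlux(String.class);

        lines.subscribe(System.out::println);   // each element is one line read from the uploaded files
    }
}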

But wait, why on earth would anybody want to upload a file through a REST API and get the lines of the file back as the response? That doesn’t make much sense, right? Maybe you want to trigger some other operation with this list of strings. Or you could publish these strings to a message broker queue. Or what if you want to save this tremendous number of lines in a database like DynamoDB?

Well, that’s a story for another step-by-step tutorial.

Please share and leave a comment for any questions or feedback.
To see the full tutorial in action with many more ingredients, browse the project on GitHub:
https://github.com/eaiman-shoshi/MultipartFileUpload

 

 

 

 
