Spring Reactive Microservices: A Showcase

Introduction and Scope

The Servlet specification was built around blocking semantics, that is, a one-request-per-thread model. In a cloud environment, machines are usually smaller than those in traditional data centers: instead of one big machine, it is popular to use many small machines and scale applications horizontally. In this scenario, the servlet model can be replaced by an architecture built upon Reactive Streams, which fits cloud environments better. The Spring Framework provides Spring WebFlux to help developers create reactive web applications [1].

Spring WebFlux, which is based on Project Reactor, allows us to:

In our demo project we will center around the reactive WebClient component making calls to remote services. This is actually a good starting point and a pretty common case. As stated in [2], the greater the latency per call or the inter-dependency among calls, the more dramatic the performance benefits are.

An extra motivation for this approach is the fact that, since Spring Framework 5.0, the org.springframework.web.client.RestTemplate class is in maintenance mode, with only minor change requests and bug fixes to be accepted going forward. Therefore, it is advised to start using org.springframework.web.reactive.function.client.WebClient, which has a more modern API. Moreover, it supports sync, async, and streaming scenarios.

We will build a sample application based on a minimal microservices architecture and demonstrate several capabilities driven by the requirements of each use case. Apart from an understanding of Java, you will need some familiarity with the Spring Cloud project, basic knowledge of Project Reactor (what a Mono is, what a Flux is) and the basics of Spring WebClient.

We will follow a business problem-solution approach to make things more realistic. Although we do not intend to cover all reactive APIs, it should be enough to give you a good idea of what lies ahead if you are about to enter this domain, and of the different mentality required.

Reactive programming is trending at the moment, but trivial "hello-world" examples with Mono & Flux simply cannot capture the demands of building 12-factor apps for production usage. You will need much more than this if you are to make a decision for real-world production systems.

With the reactive WebClient we can return reactive types (e.g. Flux or Mono) directly from Spring MVC controller methods. Spring MVC controllers can call other reactive components too. A mix is also possible when some endpoints and services cannot become reactive, for example because of blocking dependencies with no reactive alternatives, or because an existing legacy app has to be migrated gradually. In our case we will follow the Annotated Controllers programming model.

Scenario / Project Structure

We are going to implement a simplified One Time Password (OTP) service, offering the following capabilities:

Our application will consist of the following microservices:

A number of remote (external) services will be invoked. We assume that our application is authorized to use them and will access them via their REST APIs. Of course, these will be mocked for simplicity. These "3rd-party" services are:

In order to simulate a microservices setup, we will use Spring Cloud with HashiCorp Consul for service discovery and Spring Cloud Gateway. There is no particular reason for not going with Eureka; just keep in mind that Consul can also play the role of a centralized configuration server, while with Eureka alone we need a separate Spring Cloud Config server.

We choose Spring Cloud Gateway instead of Zuul for the following reasons:

We will go with Spring Cloud Loadbalancer (instead of Ribbon) for client-side load balancing and with @LoadBalanced WebClient (instead of Feign) for service-to-service communication. Apart from this, each microservice will be based on Spring Boot. We will also bring Spring Data R2DBC into play to integrate with a PostgreSQL database using a reactive driver. A diagram of our components is shown below:

One Time Password as a Service

The full source code can be inspected at https://github.com/kmandalas/webclient-showcase. 

I. Generate OTP

Business Requirement

Given the number of a user in E.164 format:

  1. fetch customer data from customer-service and in parallel validate the number status using the number-information service
  2. produce an OTP pin and save it in the DB
  3. invoke the notification-service to deliver it
  4. return response
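Before any remote call is made, the input itself can be checked. The following is a minimal sketch of such a check, assuming the commonly used E.164 shape of a plus sign followed by up to 15 digits with a non-zero leading digit. The class and method names are hypothetical and are not taken from the showcase project.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of the showcase project: checks that a number
// looks like E.164 ("+" followed by 2 to 15 digits, the first one non-zero).
final class E164 {

    private static final Pattern E164_PATTERN = Pattern.compile("^\\+[1-9]\\d{1,14}$");

    private E164() {
    }

    static boolean isValid(String msisdn) {
        return msisdn != null && E164_PATTERN.matcher(msisdn).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("+306972000000"));  // well-formed number
        System.out.println(isValid("00306972000000")); // missing the leading plus sign
    }
}
```

A check like this only validates the format; whether the number is actually in service is what the number-information call in step 1 is for.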

Solution

You can see the full implementation here


First of all, we see that we need to communicate with 1 internal microservice (service-to-service communication) and with 2 external (remote) services.

As already mentioned, we choose to go with @LoadBalanced WebClient. Therefore, we need to have a loadbalancer implementation in the classpath. In our case, we have added the spring-cloud loadbalancer dependency to the project. This way, a ReactiveLoadBalancer will be used under the hood. 

Alternatively, this functionality could also work with spring-cloud-starter-netflix-ribbon, but the request would then be handled by a non-reactive LoadBalancerClient. Plus, like we said, Ribbon is already in maintenance mode, so it is not recommended for new projects [3].

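One more thing we need is to disable Ribbon in the application properties of our services. With Spring Cloud LoadBalancer this is done by setting the spring.cloud.loadbalancer.ribbon.enabled property to false.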


Finally, a note about Feign, which until now was quite a popular choice along with Ribbon: the OpenFeign project does not currently support reactive clients, and neither does Spring Cloud OpenFeign. Therefore, we will not use it. For more details check here.

Now, here are a couple of practical issues that one may face with real-world applications:

  1. the need for multiple WebClient objects.
  2. the need to propagate a JWT token in case our various endpoints are protected.

To deal with the 1st issue, we will declare 2 different WebClient beans inside our WebClientConfig class. This is necessary since service discovery and load balancing are only applicable to our own domain and services. Therefore, we need different WebClient bean instances, which of course may differ in configuration (e.g. timeouts) beyond the @LoadBalanced annotation.

For the 2nd issue we need to propagate the access token within the header attribute of the WebClient:
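One possible shape for the token handling involved in this step is sketched below with plain JDK types. It is an assumption about what such a utility could look like, not the project's actual code: the class name BearerTokens and both method names are hypothetical.

```java
import java.util.Optional;

// Hypothetical utility (names are assumptions): pulls the bearer token out of the
// incoming Authorization header and rebuilds the header value for the outgoing call.
final class BearerTokens {

    private static final String PREFIX = "Bearer ";

    private BearerTokens() {
    }

    // Returns the raw token, or empty when the header is missing or not a bearer header.
    static Optional<String> extract(String authorizationHeader) {
        if (authorizationHeader == null
                || !authorizationHeader.regionMatches(true, 0, PREFIX, 0, PREFIX.length())) {
            return Optional.empty();
        }
        String token = authorizationHeader.substring(PREFIX.length()).trim();
        return token.isEmpty() ? Optional.empty() : Optional.of(token);
    }

    // Value to set on the "Authorization" header of the downstream request.
    static String toHeaderValue(String token) {
        return PREFIX + token;
    }

    public static void main(String[] args) {
        String incoming = "Bearer abc.def.ghi";
        extract(incoming).map(BearerTokens::toHeaderValue).ifPresent(System.out::println);
    }
}
```

With WebClient, the resulting value would typically be passed to the request's header method under the "Authorization" name, so that the downstream service sees the same token as the gateway did.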


In the snippet above, we assume we have a utility method that gets a JWT token from the incoming request forwarded via Spring Cloud Gateway to the otp-service. We use it to set the "Authorization" header with the value of the Bearer token, effectively passing it on to the customer-service. Keep in mind that the gateway-service also needs matching settings in its application.yml in order to allow this relay.


Now that we have these sorted out, let's see which Reactor Publisher functions we can use to get the result:

For a full list of options you may check the Mono API.
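To make the shape of the generate flow concrete without pulling in any framework, here is a rough sketch that uses the JDK's CompletableFuture as a stand-in for Mono, with thenCombine playing the role that Mono.zip plays in the project. All class, method and value names are hypothetical, the remote calls are stubbed, and persisting the pin is left out.

```java
import java.security.SecureRandom;
import java.util.Locale;
import java.util.concurrent.CompletableFuture;

// Framework-free sketch of the generate flow. CompletableFuture stands in for Mono and
// thenCombine for Mono.zip; every name here is hypothetical.
final class GenerateOtpSketch {

    private static final SecureRandom RANDOM = new SecureRandom();

    private GenerateOtpSketch() {
    }

    // Stand-in for the remote call to customer-service.
    static CompletableFuture<String> fetchCustomerEmail(String msisdn) {
        return CompletableFuture.supplyAsync(() -> "customer-of-" + msisdn + "@example.com");
    }

    // Stand-in for the remote call to the number-information service.
    static CompletableFuture<Boolean> isNumberActive(String msisdn) {
        return CompletableFuture.supplyAsync(() -> !msisdn.endsWith("0000"));
    }

    // Stand-in for notification-service; completes with the pin that was delivered.
    static CompletableFuture<String> notifyCustomer(String email, String pin) {
        return CompletableFuture.supplyAsync(() -> pin);
    }

    static String newPin() {
        return String.format(Locale.ROOT, "%06d", RANDOM.nextInt(1_000_000));
    }

    static CompletableFuture<String> generate(String msisdn) {
        // Both calls are started here, so they run concurrently.
        CompletableFuture<String> customer = fetchCustomerEmail(msisdn);
        CompletableFuture<Boolean> active = isNumberActive(msisdn);
        return customer
                .thenCombine(active, (email, isActive) -> {
                    if (!isActive) {
                        throw new IllegalStateException("Number is not active: " + msisdn);
                    }
                    return email;
                })
                // Saving the pin to the DB would happen before notifying; omitted in this sketch.
                .thenCompose(email -> notifyCustomer(email, newPin()));
    }

    public static void main(String[] args) {
        System.out.println(generate("+306972000001").join());
    }
}
```

The point of the sketch is the ordering: the two lookups are combined once both have completed, and only then does the dependent notification step run. In the reactive version nothing blocks; join appears here only so that main can print a result.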

II. Validate OTP

Business Requirement

Given an existing OTP id and a valid pin previously delivered to user's device:

  1. fetch the corresponding OTP record from the DB by querying "otp" table by id
  2. if found, fetch information of maximum attempts allowed from configuration table "application", otherwise return error
  3. perform validations: check if maximum attempts exceeded, check for matching pin, if OTP has expired etc.
  4. if validation checks fail, then return error, otherwise update the OTP status to VERIFIED and return success
  5. in case of error we need to finally save the updated counter of maximum attempts and OTP status back to the database

We assume here that we can have OTPs associated with applications and we can have different time-to-live periods, different number of maximum attempts allowed etc. We keep these configuration data in a second DB table named "application".
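The validation rules in step 3 can be written down as a small pure function, which keeps them easy to unit test apart from the reactive plumbing. The sketch below is one possible reading of those rules; the class name, the Result enum and the parameter list are assumptions rather than the project's API.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical, framework-free version of the validation rules; the names and the
// result enum are assumptions, not the showcase project's API.
final class OtpValidator {

    enum Result { VERIFIED, TOO_MANY_ATTEMPTS, EXPIRED, WRONG_PIN }

    private OtpValidator() {
    }

    static Result validate(String storedPin, Instant createdOn, Duration timeToLive,
                           int attemptsSoFar, int maxAttempts,
                           String submittedPin, Instant now) {
        if (attemptsSoFar >= maxAttempts) {
            return Result.TOO_MANY_ATTEMPTS;
        }
        if (now.isAfter(createdOn.plus(timeToLive))) {
            return Result.EXPIRED;
        }
        return storedPin.equals(submittedPin) ? Result.VERIFIED : Result.WRONG_PIN;
    }

    public static void main(String[] args) {
        Instant created = Instant.parse("2020-01-01T10:00:00Z");
        System.out.println(validate("123456", created, Duration.ofMinutes(5), 0, 3,
                "123456", created.plusSeconds(60)));
    }
}
```

Here the time-to-live and the maximum number of attempts are passed in as arguments, which matches the idea that they come from the per-application configuration table rather than being hard-coded.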

Solution

You can check the full implementation here:

Let us pause for a moment here and notice that at the end, inside our doOnError method, we call the subscribe method. According to the Reactor documentation, subscribe and its variants are what trigger the execution of a reactive chain of actions. So far, however, we did not have it anywhere in the code, and we did not need it, since we return either Mono or Flux all the way back to our REST controllers, which perform the subscribe for us behind the scenes. As Rossen Stoyanchev says in his must-watch presentation Guide to "Reactive" for Spring MVC Developers, "you should keep the flight going", i.e. if possible do not block, and return reactive types from your endpoints. Inside doOnError, on the other hand, we do not return anything, so we have to call subscribe ourselves in order to trigger our reactive repository to execute that update.
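Why an unsubscribed repository call does nothing can be seen in a toy model of deferred execution. The class below is not Reactor code, only a few lines of plain Java that mimic the one property that matters here: building the pipeline has no side effect until subscribe is called. The name Deferred and its methods are hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Toy model of a cold publisher (not Reactor's implementation): the work is only
// described when the object is created and runs once per call to subscribe.
final class Deferred<T> {

    private final Supplier<T> work;

    private Deferred(Supplier<T> work) {
        this.work = work;
    }

    static <T> Deferred<T> fromSupplier(Supplier<T> work) {
        return new Deferred<>(work);
    }

    void subscribe(Consumer<? super T> onValue) {
        onValue.accept(work.get());
    }

    public static void main(String[] args) {
        AtomicInteger updates = new AtomicInteger();
        // Stand-in for a reactive repository save: creating it has no side effect yet.
        Deferred<Integer> save = Deferred.fromSupplier(updates::incrementAndGet);
        System.out.println("updates before subscribe: " + updates.get());
        save.subscribe(count -> System.out.println("updates after subscribe: " + count));
    }
}
```

In the real application the framework subscribes to whatever the controller returns, so this only becomes visible in side branches such as an error callback, where nobody else will subscribe for us.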

III. Resend OTP

Business Requirement

Given an existing OTP id, make it possible to re-send the OTP via multiple channels (SMS, e-mail, Viber etc.) concurrently (in parallel):

  1. fetch the corresponding OTP record from the DB by querying "otp" table by id
  2. if not found or its status is no longer valid (e.g. EXPIRED), return error
  3. if found, proceed re-sending it to the customer via multiple channels and simultaneously via the notification-service
  4. return the sent OTP to the caller

Solution

You can check the implementation here:


Our focus in this one is to demonstrate parallel calls to the same endpoint. In our first use case we saw how we can make parallel calls to different endpoints, returning different types, using Mono.zip. Now we will use Flux.merge instead.
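As a framework-free approximation of this fan-out, the sketch below starts one call per channel against the same stubbed endpoint and gathers the outcomes with CompletableFuture.allOf. It is an analogy, not the project's code: Flux.merge emits results as they arrive, whereas this version waits for all of them and returns them in channel order. The channel names and the sendVia stub are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Framework-free approximation of the resend fan-out; channel names and the
// sendVia stub are hypothetical.
final class ResendSketch {

    private ResendSketch() {
    }

    // Stand-in for one call to the notification-service endpoint.
    static CompletableFuture<String> sendVia(String channel, String pin) {
        return CompletableFuture.supplyAsync(() -> channel + ":sent");
    }

    static CompletableFuture<List<String>> resend(String pin, List<String> channels) {
        List<CompletableFuture<String>> calls = new ArrayList<>();
        for (String channel : channels) {
            calls.add(sendVia(channel, pin)); // each call starts immediately
        }
        return CompletableFuture.allOf(calls.toArray(new CompletableFuture[0]))
                .thenApply(done -> {
                    List<String> results = new ArrayList<>();
                    for (CompletableFuture<String> call : calls) {
                        results.add(call.join()); // already complete, so this does not wait
                    }
                    return results;
                });
    }

    public static void main(String[] args) {
        System.out.println(resend("123456", Arrays.asList("SMS", "EMAIL", "VIBER")).join());
    }
}
```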

Let's see how we approached this:

IV. Get All OTPs and OTP Status

Business Requirement

Get back a list of OTPs for a given MSISDN and a single OTP by id for checking its status.

Solution

You can check the implementation here.

We left the more straightforward cases for the end, since you will find many examples that simply return Flux or Mono, usually by querying a relational or NoSQL database with reactive driver support. Keep in mind though that, in the case of relational databases, Spring Data R2DBC (R2DBC stands for Reactive Relational Database Connectivity) does not offer many features of ORM frameworks (JPA, Hibernate). Its primary target is to be a simple, limited, opinionated object mapper. So, if you are used to JPA and Hibernate from your past projects, prepare for a mind shift about this part as well.

Other Topics

Logging

Logging is an important aspect for every kind of software. Solutions based on Microservices architectures have additional demands for centralized logging. However, when we are using File Appenders for logging then we have an issue since this I/O operation is blocking. See the following issue for an example:

A solution is to select and configure Async Appenders which seem to be supported by major SLF4J implementations like Log4j and Logback. In our example we go with the Logback AsyncAppender. An example configuration can be seen here.
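The idea behind an asynchronous appender can be illustrated in a few lines of plain Java: callers hand their events to a bounded queue and return immediately, while a single background thread performs the slow write. The class below is a toy model under that assumption and is not how Logback implements it; all names are hypothetical, and dropping events when the queue is full is a simplification chosen for the sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Toy model of an asynchronous appender (not Logback's implementation): callers only
// enqueue events, and a single worker thread performs the slow, blocking write.
final class AsyncLogSketch {

    private static final String STOP = new String("stop"); // sentinel, compared by identity

    private final BlockingQueue<String> queue;
    private final List<String> written = new ArrayList<>(); // stand-in for the log file
    private final Thread worker;

    AsyncLogSketch(int queueSize) {
        this.queue = new ArrayBlockingQueue<>(queueSize);
        this.worker = new Thread(this::drain, "async-log-worker");
        this.worker.setDaemon(true);
        this.worker.start();
    }

    // Never blocks the caller: returns false (event dropped) when the queue is full.
    boolean log(String event) {
        return queue.offer(event);
    }

    private void drain() {
        try {
            for (String event = queue.take(); event != STOP; event = queue.take()) {
                written.add(event); // only the worker thread touches this list while running
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Flushes what is queued, stops the worker and returns everything that was written.
    List<String> close() {
        try {
            queue.put(STOP);
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while closing", e);
        }
        return new ArrayList<>(written);
    }

    public static void main(String[] args) {
        AsyncLogSketch log = new AsyncLogSketch(256);
        log.log("OTP generated");
        log.log("OTP validated");
        System.out.println(log.close());
    }
}
```

The trade-off is the usual one for such designs: the request thread is never held up by disk I/O, at the price of possibly losing events under sustained overload.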

The AsyncAppender has five (5) configuration options: queueSize, discardingThreshold, includeCallerData, maxFlushTime and neverBlock.

Distributed Tracing

Tracing is another vital aspect of Microservices monitoring. We can trace all calls that are made from/to the microservices, using Spring Cloud Sleuth and Jaeger.

Sleuth offers a convenient auto-configuration that works out-of-the-box with popular frameworks like Spring MVC and WebFlux. It allows injecting trace and span IDs automatically and displaying this information in the logs, as well as annotation-based span control. To make it work with Jaeger, we need to enable the Zipkin collector port in Jaeger's configuration.

One thing to have in mind is that limitations do exist here as well. For example, tracing database calls with R2DBC is not yet supported. You may find the related issue here:

The following is a screenshot of the Jaeger UI homepage:

[Figure: Jaeger home page]

This is an example of tracing the call which generates OTPs:

[Figure: tracing the call which generates OTPs]

This is an example of tracing the call which resends an OTP via multiple channels in parallel:

[Figure: tracing the call that sends an OTP]

Reactive Types Support for @Cacheable Methods

Spring's @Cacheable annotation is a convenient approach to handle caching, usually at the services level. This cache abstraction works seamlessly with various caching implementations including JSR-107 compliant caches, Redis etc. However, at the moment of writing there is still no reactive types support for @Cacheable methods. The related issue is:

And although Redis is a pretty common centralized cache solution, and a Reactive driver for Redis exists in the Spring Data project, there is no plan at the moment to add a reactive cache implementation:

Handling Special Cases

On a typical Spring WebFlux server (Netty), you can expect one thread for the server and several others for request processing, typically as many as the number of CPU cores. Although WebClient does not block the thread, sometimes it is desirable to use a thread pool other than the main worker thread pool shared with the server [5]. Such cases may be:

For this purpose, Project Reactor, on which Spring WebFlux is built, provides thread pool abstractions called Schedulers. These can be combined with Task Executors, with which we can create different concurrency strategies, set the minimum and maximum number of threads, etc.


Let's now imagine that we want to check the status of five MSISDNs simultaneously and return the result as a list of NotificationResultDTOs:

For a full list of options you may check the ParallelFlux API.
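The underlying idea of a dedicated pool can be shown with JDK classes alone. The sketch below creates a small, named thread pool and uses it to check five numbers concurrently, returning the results in input order. It is an analogue of running work on a separate scheduler, not the project's Reactor code: the class, the pool name and the checkStatus stub are hypothetical, and plain strings stand in for the NotificationResultDTO objects mentioned above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// JDK-only analogue of running calls on a dedicated scheduler: a small, named thread
// pool handles the lookups so the server's worker threads stay free. Names are hypothetical.
final class DedicatedPoolSketch {

    private DedicatedPoolSketch() {
    }

    static ExecutorService newPool(int threads, String prefix) {
        AtomicInteger counter = new AtomicInteger();
        return Executors.newFixedThreadPool(threads, task -> {
            Thread thread = new Thread(task, prefix + "-" + counter.incrementAndGet());
            thread.setDaemon(true);
            return thread;
        });
    }

    // Stand-in for a slow or blocking status lookup; reports which thread served it.
    static String checkStatus(String msisdn) {
        return msisdn + " checked on " + Thread.currentThread().getName();
    }

    static List<String> checkAll(List<String> msisdns, ExecutorService pool) {
        List<CompletableFuture<String>> pending = new ArrayList<>();
        for (String msisdn : msisdns) {
            pending.add(CompletableFuture.supplyAsync(() -> checkStatus(msisdn), pool));
        }
        List<String> results = new ArrayList<>();
        for (CompletableFuture<String> future : pending) {
            results.add(future.join()); // waiting like this belongs outside event-loop threads
        }
        return results;
    }

    public static void main(String[] args) {
        ExecutorService pool = newPool(5, "status-check");
        try {
            List<String> numbers = Arrays.asList("+301", "+302", "+303", "+304", "+305");
            checkAll(numbers, pool).forEach(System.out::println);
        } finally {
            pool.shutdown();
        }
    }
}
```

Naming the threads is a small design choice that pays off in logs and thread dumps, where it shows at a glance whether work ran on the dedicated pool or leaked onto the server's own threads.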

BlockHound

The change from blocking to non-blocking code, or from imperative to reactive programming, is tricky and demands building up a level of experience before you feel comfortable. Sometimes it may be hard to detect blocking code in a Reactor thread. This is because we do not need to call block to make things blocking: we can unknowingly introduce blocking by using a library which blocks the current thread.

A useful utility that helps us detect some of these cases is BlockHound. It is sponsored by Pivotal and can be used in various ways, but we recommend using it during the test phase. The only thing you need for this is to include the BlockHound dependency in your build.


Keep in mind that if you go with Java 13 and above (the release in which this flag was introduced), the following JVM argument is needed for the tool to work properly:

-XX:+AllowRedefinitionToAddDeleteMethods

Integration Testing

In our sample project we show an example integration test covering our most "complicated" endpoint, which is the one that generates an OTP. We use HoverFly for mocking the responses of the two "external" services (i.e. number-information and notification-service) and the call to our "internal" service (i.e. customer-service). We also use Testcontainers for spinning up a dockerized PostgreSQL database during the test's execution.

The full code can be seen in OTPControllerIntegrationTests class.

We also use WebTestClient which is a Client for testing web servers that uses WebClient internally to perform requests while also providing a fluent API to verify responses. This client can connect to any server over HTTP, or to a WebFlux application via mock request and response objects.

One thing worth mentioning is the "trick" we perform in order to simulate the existence of a customer-service instance. A ServiceInstance represents an instance of a service in a discovery system. When running integration tests, we usually have part of the cloud features disabled, and service discovery is one of them. However, since we use a @LoadBalanced WebClient when we invoke the customer-service during the integration flow we test, we need a way to simulate a "static" instance of this service. Moreover, we need to "bind" it with HoverFly so that, when it is invoked, it returns the mocked response we want. This is achieved with the gr.kmandalas.service.otp.OTPControllerIntegrationTests.TestConfig static class.

Async SOAP

Nowadays most of the systems we integrate with expose REST endpoints. It is not uncommon, however, to still have to integrate with SOAP-based web services. Both JAX-WS and Apache CXF allow the generation of non-blocking clients. You may find an example of how to deal with such a case at Reactive Web Service Client with JAX-WS.

How to Run

To build and test the application, the prerequisites are:

Then simply execute from a terminal:

mvn clean verify

The easiest way is to run the microservices using Docker and Docker Compose:

docker-compose up --build

When the containers are up and running, you can visit consul's UI to see the active services:

http://localhost:8500/ui/dc1/services

[Figure: services running via Consul UI]

Below, you may find curl commands for invoking the various endpoints via our API Gateway:

Generate OTP


Validate OTP


Resend OTP


Get All OTPs


OTP Status

Conclusion

Performance has many characteristics and meanings. Reactive and non-blocking generally do not make applications run faster. They can in some cases, for example when using the WebClient to run remote calls in parallel, while at the same time sparing us from dealing with Task Executors in favor of a more elegant and fluent API. This comes, of course, with a significant learning curve.

The key expected benefit of reactive and non-blocking is the ability to scale with a small, fixed number of threads and less memory. That makes applications more resilient under load because they scale in a more predictable way. In order to observe those benefits, however, you need to have some latency (including a mix of slow and unpredictable network I/O). That is where the reactive stack begins to show its strengths, and the differences can be dramatic [8].

Some interesting load testing and comparison results are presented at [9]. The conclusion is that Spring WebFlux with WebClient and Apache clients "win" in all cases. The most significant difference (4 times faster than blocking Servlet) comes when the underlying service is slow (500ms). It is 15–20% faster than non-blocking Servlet with CompletableFuture. Also, it does not create a lot of threads compared with Servlet (20 vs 220).

So, if you are about to start building reactive microservices with Spring and are eager to take advantage of the benefits above, we hope we gave you a good picture of the challenges that lie ahead and the amount of preparation needed.

 

 

 

 
