Reactive Microservices Done Right!

Introduction

In this article, we will look at how effective microservices development can be modeled around the concepts of reactive systems. First, we will look at what a reactive system is and what it takes to build one. Then we will look at how these concepts apply to the microservices world, and specifically how Ballerina, a programming language designed from the ground up with first-class support for microservices, fulfills them with some easy-to-use abstractions for developers.

Reactive Systems

A reactive system is created by following a set of guiding principles, which are laid out in the Reactive Manifesto. The manifesto describes how to create systems that take advantage of the latest hardware and use those resources efficiently. In essence, it provides a guideline for building better distributed systems. A reactive system is said to be responsive, resilient, elastic, and message-driven.

Figure 1: Reactive system properties (source: https://www.reactivemanifesto.org/)

We hear a lot about reactive programming, with libraries such as RxJava and Project Reactor, and frameworks such as Vert.x and Spring 5 WebFlux, which use the reactive programming model to expose their functionality. Reactive programming is basically an asynchronous programming technique that emphasizes non-blocking execution. It follows a data flow model where execution advances as and when data is available, so the execution is never blocked on resources. The reactive libraries and frameworks mainly help in orchestrating these data flows.

Reactive programming is primarily implemented using a callback-based approach, or a derivative of it. While it provides the benefits of asynchronous execution, it is not suitable for all use cases and has its limitations. Using reactive programming does not mean that you have a reactive system; it can be an enabler, but it is not necessarily a requirement. Let’s take a closer look at each of the reactive system properties, the tools and technologies related to them, and the Ballerina language’s take on each.

Responsiveness

A system is responsive when it responds to requests promptly. For example, in a web application, if a user clicks a button to download some data from the Internet and show it on the screen, we would register a callback, show a loading screen, and return from the click handler immediately. At this point, the user can still interact with the screen, cancel the request, or select another option. If, instead, the button click handler loaded the data synchronously without registering a callback, the whole screen would be unresponsive until the operation finished. The callback-based version is an example of a system designed to be responsive.

In the domain of microservices and service clients, we should make optimal use of the processing and I/O resources we have to satisfy this property. A general optimization is to never block a running thread unless we absolutely have to. Blocking mainly occurs in I/O-bound operations, where the executing thread waits on some type of I/O operation. Since we typically allocate a fixed number of threads in the application, by means of a thread pool, we may run out of threads when there is a high number of users, and new requests will then be blocked from executing. This is an undesirable situation and should be addressed.
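The thread exhaustion described above can be reproduced with a small sketch. It is only an illustration under stated assumptions: the class and method names (`BlockedPoolDemo`, `queuedWhileBlocked`) are hypothetical, and a `CountDownLatch` stands in for a slow blocking I/O call.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, not taken from the article's listings.
class BlockedPoolDemo {

    // Submits `requests` tasks that all block on simulated I/O and returns
    // how many of them are stuck in the queue because every worker is busy.
    static int queuedWhileBlocked(int poolSize, int requests) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                poolSize, poolSize, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>());
        // Stands in for a blocking I/O call that has not returned yet.
        CountDownLatch ioDone = new CountDownLatch(1);
        try {
            for (int i = 0; i < requests; i++) {
                pool.execute(() -> {
                    try {
                        ioDone.await(); // the worker thread is held here
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            // Requests beyond the pool size cannot start until a worker is free.
            return pool.getQueue().size();
        } finally {
            ioDone.countDown(); // "I/O" completes, workers are released
            pool.shutdown();
        }
    }
}
```

With a pool of 2 threads and 5 incoming requests, `BlockedPoolDemo.queuedWhileBlocked(2, 5)` returns 3: three requests wait in the queue even though the two worker threads are doing no useful work.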

This scenario is generally tackled by using non-blocking I/O with asynchronous programming. Let’s take a look at a use case of looking up hospital information using the caller’s IP address. For this, we have two services at our disposal, geoIpSvc and hospitalInfoSvc. They are used to look up the geographical location of a given IP address and to look up a hospital in a given zip code, respectively. We also have the requirement of implementing this scenario in a non-blocking manner to promote efficient resource usage.

Let’s start with our first implementation using Spring Boot with RxJava. 

Listing 1: Spring Boot / RxJava implementation of geoip-hospital lookup

Listing 1 shows how we have used a chaining approach to connect the result of one service to the input of a subsequent service call. This is not straightforward at first since we are working with Observable structures. Getting used to this nested functional composition generally involves a steep learning curve, and it becomes harder to handle as we increase the number of layers. Debugging also becomes increasingly difficult because of how the Observable instances are executed. It is not straightforward to simply set a breakpoint and follow the execution. Instead, we generally have to place careful trace statements to track the execution of the logic.
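The chaining style discussed here can be approximated with the JDK alone. The sketch below is not the Spring Boot / RxJava code from Listing 1; it swaps in `CompletableFuture` for RxJava’s Observable, and the two service methods are hypothetical in-memory stand-ins for geoIpSvc and hospitalInfoSvc with made-up data.

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical sketch: CompletableFuture is used in place of RxJava.
class HospitalLookup {

    // Stand-in for geoIpSvc: resolves an IP address to a zip code (sample data).
    static CompletableFuture<String> geoIpSvc(String ip) {
        return CompletableFuture.supplyAsync(
                () -> "10.0.0.1".equals(ip) ? "95054" : "00000");
    }

    // Stand-in for hospitalInfoSvc: finds a hospital for a zip code (sample data).
    static CompletableFuture<String> hospitalInfoSvc(String zip) {
        return CompletableFuture.supplyAsync(
                () -> "95054".equals(zip) ? "Example General Hospital" : "unknown");
    }

    // Chains the two calls: the zip code produced by the first service is fed
    // into the second one when it becomes available, without blocking the caller.
    static CompletableFuture<String> findHospital(String ip) {
        return geoIpSvc(ip).thenCompose(HospitalLookup::hospitalInfoSvc);
    }
}
```

A caller would attach a continuation, for example `HospitalLookup.findHospital("10.0.0.1").thenAccept(System.out::println)`, rather than waiting for the result. Even in this two-step case the control flow lives inside the composition operators, which is the source of the learning and debugging costs mentioned above.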

Let’s now take a look at implementing the same use case using Ballerina. 

Listing 2: Ballerina implementation of geoip-hospital lookup

The Ballerina code in Listing 2 implements the same functionality as the reactive Java implementation in Listing 1. It’s just two lines, and simply contains direct calls to the services, one after another, to get the required functionality done. We can see that it uses synchronous-looking calls to contact the remote services. So are they blocking I/O calls? No, Ballerina actually performs non-blocking I/O invocations in a transparent manner. This is where the beauty of programming abstractions comes into play. Ballerina has abstracted away the non-blocking invocations behind remote method calls. Non-blocking I/O is complicated for a developer to implement by hand, so the operations required to make it work are hidden, providing a much more elegant programming experience.

This functionality is made possible by the concurrency construct that underlies the Ballerina language, called strands. A Ballerina program is executed on operating system threads, where a thread contains one or more strands. At any given time, a thread executes a single strand. A strand has the properties of a coroutine, and the strands in a thread perform cooperative multitasking. This basically means that, on an event such as an I/O call, the current strand can yield the thread so that it can execute another strand. This effectively releases the thread, and the strand remembers the position where it suspended its execution. The I/O operation is handed over to the operating system to be executed by the underlying hardware, and when the I/O operation is completed, the Ballerina runtime is notified and the suspended strand is resumed, on an available thread, from the location where it stopped earlier.

Due to the inherent non-blocking architecture of Ballerina, the primary thread pool it uses internally for request dispatching can be sized to match the number of physical cores available in the machine. This promotes optimal resource utilization since it reduces unwanted thread context switches. It has an especially significant impact on reducing the tail latencies of service requests. This is probably because the context switch overhead is not averaged across all requests, but is instead incurred by a smaller number of requests that are in the processing queue for a thread. For a more in-depth analysis of this, check out the blog written by the Ballerina team’s resident performance expert Malith Jayasinghe.

The Java reactive solution can only offer a callback-based approach for non-blocking operations, since Java does not have native language-level features for implementing coroutine-based functionality. Even though there are specific use cases where callbacks are useful, for example in UI programming as we saw earlier, the direct style of network I/O calls we used in our scenario is not possible in a non-blocking form due to a technical limitation in Java at this moment. Java developers can be hopeful about the continuations functionality in the upcoming Project Loom.

Resilience

A system is resilient when it still responds in the event of failures during execution. We should basically expect failures and take proactive measures to handle them. In a microservice architecture, this is especially important because, as we move away from monolithic solutions to disaggregated architectures that are network accessible, the unreliable nature of the network is something we cannot ignore.

For example, the Ballerina language supports the circuit breaker pattern, which short-circuits calls to a failing backend service and responds to the caller immediately until the backend service is healthy again. An example of this can be seen in Listing 3.

Listing 3: HTTP client initialization in Ballerina with a circuit breaker configuration
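To make the pattern itself concrete, here is a minimal circuit breaker sketch in plain Java. It is not Ballerina’s implementation or configuration model; the class name, the consecutive-failure threshold, the reset timeout, and the injectable clock are all assumptions made for illustration.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical, simplified circuit breaker; not Ballerina's implementation.
class CircuitBreaker {
    private final int failureThreshold;     // consecutive failures before opening
    private final long resetTimeoutMillis;  // how long to fail fast once open
    private final LongSupplier clock;       // injectable time source, in milliseconds
    private int consecutiveFailures = 0;
    private boolean open = false;
    private long openedAt = 0L;

    CircuitBreaker(int failureThreshold, long resetTimeoutMillis, LongSupplier clock) {
        this.failureThreshold = failureThreshold;
        this.resetTimeoutMillis = resetTimeoutMillis;
        this.clock = clock;
    }

    // Calls the backend unless the circuit is open; returns the fallback on
    // failure or while the circuit is open. Holding the lock during the call
    // keeps the sketch simple, but it serializes callers.
    synchronized <T> T call(Supplier<T> backend, T fallback) {
        if (open && clock.getAsLong() - openedAt < resetTimeoutMillis) {
            return fallback; // open: respond immediately, skip the backend
        }
        try {
            T result = backend.get(); // closed, or a trial call after the timeout
            consecutiveFailures = 0;
            open = false;
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            if (consecutiveFailures >= failureThreshold) {
                open = true;
                openedAt = clock.getAsLong();
            }
            return fallback;
        }
    }
}
```

In production code the clock would typically be `System::currentTimeMillis`. Once the threshold is reached, callers receive the fallback without touching the backend until the reset timeout has passed, at which point a single trial call decides whether the circuit closes again.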

For more information on the resiliency features of Ballerina, check the following examples:

Observability

The observability of microservices is also a critical aspect when anticipating the failure of the system. Ballerina provides inherent support for observability with language-level features such as services, client types, and remote methods. This allows the runtime to automatically monitor these resources with minimal developer intervention. 

Figure 2: Ballerina observability dashboards

The runtime generates metrics data for aggregates such as service request counts, latency percentiles, and error rates. It also generates distributed traces, which help in debugging and optimizing service-to-service communication flows. For more in-depth information on automated observability in Ballerina, see this article.

Elasticity

The elasticity of a system is its ability to spin up new instances of services and to discover these new endpoints in order to communicate with them. With the advent of cloud computing platforms, these capabilities have become a commodity. This has been accelerated by the adoption of container-based technologies like Docker, and container orchestration systems like Kubernetes.

Kubernetes has become the de facto standard in container management, and due to its common API for creating deployments, the same application can be deployed on any Kubernetes-supported cloud platform, such as AKS, EKS, and GKE.

Due to their inherent properties, such as quick deployment, isolation, and a lightweight nature, containers are an ideal technology for executing microservices. Ballerina adopted container technologies early by allowing developers to generate Docker images and Kubernetes artifacts automatically from the project build itself while following best practices. In Listing 4 we can see a Ballerina service with the Kubernetes annotations that direct the compiler to generate the Docker image and the Kubernetes artifacts required for deployment to a remote cluster.

Listing 4: Ballerina service for Kubernetes deployment

Listing 5: Ballerina service build with Kubernetes support

Using the Kubernetes annotation attached to the Ballerina service, we can configure many facets of the deployment, such as service information, config maps, ingress, and HPA. For example, the HPA configuration above provides the horizontal pod autoscaler details, which are used to scale the number of pods based on CPU utilization.
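The scaling decision behind CPU-based autoscaling can be sketched as a small calculation. The Kubernetes documentation gives the core rule as the current replica count multiplied by the ratio of the current metric to the target metric, rounded up. The sketch below applies that rule and clamps the result to a minimum and maximum; the class name and parameter names are hypothetical, and details of the real controller such as its tolerance band and stabilization window are deliberately left out.

```java
// Hypothetical helper illustrating the HPA scaling rule; not the controller's code.
class HpaSketch {

    // desired = ceil(currentReplicas * currentCpuPercent / targetCpuPercent),
    // clamped to the [minReplicas, maxReplicas] range.
    static int desiredReplicas(int currentReplicas, double currentCpuPercent,
                               double targetCpuPercent, int minReplicas, int maxReplicas) {
        int desired = (int) Math.ceil(currentReplicas * (currentCpuPercent / targetCpuPercent));
        return Math.max(minReplicas, Math.min(maxReplicas, desired));
    }
}
```

For instance, 2 pods running at 90% CPU against a 50% target give ceil(2 × 1.8) = 4 pods, provided the configured maximum allows it.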

For more information on Ballerina’s code to cloud story, check here. 

Message Driven

Figure 3: Message-driven microservices

Microservices are mostly accessed either through a REST interface or through message passing. A RESTful interface is required if you have a strict synchronous request/response execution. If not, the preferred approach for a microservices architecture is a message-driven system. This promotes loose coupling, isolation, and better resilience against failures.

By using message passing, we make sure that future extensions of the system, such as adding new services that consume the output of an existing service, can be done seamlessly without redeploying any of the existing services.
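The extensibility argument can be illustrated with a toy in-memory broker. This is only a sketch: the `InMemoryBroker` class and the topic name are hypothetical, delivery is synchronous and in-process, and a real deployment would use a broker such as NATS, Kafka, or RabbitMQ instead.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Hypothetical in-process broker used only to illustrate loose coupling.
class InMemoryBroker {
    private final Map<String, List<Consumer<String>>> subscribers = new ConcurrentHashMap<>();

    // Registers a new consumer for a topic; publishers are not touched.
    void subscribe(String topic, Consumer<String> handler) {
        subscribers.computeIfAbsent(topic, t -> new CopyOnWriteArrayList<>()).add(handler);
    }

    // Delivers the message to every current subscriber of the topic and
    // returns how many consumers received it.
    int publish(String topic, String message) {
        List<Consumer<String>> handlers =
                subscribers.getOrDefault(topic, Collections.<Consumer<String>>emptyList());
        for (Consumer<String> handler : handlers) {
            handler.accept(message);
        }
        return handlers.size();
    }
}
```

The publishing service only knows the topic name. A second consumer, say an audit service, can subscribe to the same topic later and start receiving messages without any change to the publisher or to the first consumer.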

Ballerina, aligning with the best practices in microservices architecture, supports many leading messaging solutions in its standard library, such as NATS, Kafka, and RabbitMQ. 

Summary

In this article, we have gone through the concepts found in the Reactive Manifesto, and how microservices can apply its principles to build a more efficient distributed system. Ballerina makes developing and deploying reactive microservices a simple task by providing built-in language features and a full platform.

For more information on writing microservices in Ballerina, check out the following resources: 

 

 

 

 
