Databases in Containers

This post was originally published here.

Database containerization has drawn its share of criticism. Data insecurity, specific resource requirements, and network problems are often cited as the practice’s significant drawbacks. Nevertheless, container usage has been on the rise, and so has the practice of containerizing databases.

Containers are now used by organizations of all sizes, from small startups to large, established microservices platforms. Even prominent database players like Google, Amazon, Oracle, and Microsoft have adopted containerization. This article aims to help beginners navigate the minefield of database containerization and avoid some of the major pitfalls. Note that we are not recommending the practice, but if you feel the need, hopefully this will help.

But what is database containerization?

Understanding Database Containerization

Database containerization packages a database together with its operating environment inside a container, so that it can be loaded onto a virtual machine and run independently.

Here are four factors that support running databases in containers.

1. Usage of the same configuration or ports for all containers

This setup eliminates some of the overhead of a distributed system that supports different node types. Such a system requires maintaining separate containers, each with its own configuration. With database containerization, a single kind of configuration serves every container.

2. Resilience, resources, and storage

Containers aren’t meant to persist data inside them. In traditional database scenarios, data often has to be replicated or exported from a central storage system, which makes the process expensive and significantly slows performance.

Databases act like any other server-side app, except that they are typically more CPU- and memory-intensive, are highly stateful, and use storage. All of these concepts work the same way in containers. On top of that, it’s possible to manage state, limit resources, and restrict network access.
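
As a rough illustration of limiting resources, the sketch below builds a Kubernetes-style container spec as a plain Python dictionary (Kubernetes being the orchestrator this article suggests further down). The helper name `database_container`, the image tag, and the CPU and memory figures are placeholder assumptions, not recommendations.

```python
import json


def database_container(name, image, cpu, memory, port):
    """Return a Kubernetes-style container spec with explicit resource bounds.

    Hypothetical helper for illustration only; all values are supplied by the caller.
    """
    return {
        "name": name,
        "image": image,
        "ports": [{"containerPort": port}],
        "resources": {
            # What the scheduler reserves for the container.
            "requests": {"cpu": cpu, "memory": memory},
            # The ceiling the container is not allowed to exceed.
            "limits": {"cpu": cpu, "memory": memory},
        },
    }


# Placeholder values: 2 CPUs and 4 GiB of memory for a single database container.
container = database_container("db", "postgres:16", "2", "4Gi", 5432)
print(json.dumps(container, indent=2))
```

Setting requests equal to limits is one common choice for databases, since it keeps the container's share of CPU and memory predictable; other workloads may prefer looser bounds.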

3. Cluster upscale or downscale

The practice addresses uncertainty about how successful an application will be, and how much volume it will have to handle, by making its infrastructure more elastic. Database containerization accommodates that elasticity: the cluster grows when needed and shrinks back to the infrastructure it actually uses. When nodes are added to a cluster, data can be rebalanced in the background.

4. Data locality and networking

Network scaling has been a significant challenge in modern virtualized data centers. Usually, load balancers take all traffic first and then distribute it to the application containers. The application containers then have to communicate with the databases, creating more traffic. Containerization brings the database and the application a little closer together, alleviating some of the networking issues.

Efficiently Deploy Databases in Containers

Putting databases in containers comes with inherent obstacles. Databases have some fundamental properties that make them hard to containerize effectively: they depend critically on persistent storage, they need disk space for large amounts of data, and they require complex layers of configuration. They also need high-throughput, low-latency networking.

If you are going to put your database in a container, it’s advisable to use the container orchestration platform Kubernetes (K8s). Its StatefulSets feature was designed to address the very problems that arise when building and running database clusters inside containers.

If you really and truly have to go down this path, try to use stable Helm Charts where possible. Note, though, that this doesn’t mean the deployment will be as stable as a managed service, but a lot of the heavy lifting will be done for you. K8s will create, schedule, and label your pods, and its self-healing restarts or replaces failed ones to maintain cluster health. Ensure the Chart you choose implements the database with StatefulSets and persistent volume claims (so that the actual data survives a failure).
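
One way to check that advice before installing a Chart is to render it (for example with `helm template`) and inspect the resulting manifests. The sketch below assumes the rendered manifests have already been parsed into Python dictionaries; the function name `uses_statefulset_with_pvc` and the sample manifests are hypothetical.

```python
def uses_statefulset_with_pvc(manifests):
    """Return True if any manifest is a StatefulSet that declares volumeClaimTemplates."""
    for manifest in manifests:
        if manifest.get("kind") != "StatefulSet":
            continue
        spec = manifest.get("spec") or {}
        # A non-empty volumeClaimTemplates list means each pod gets its own claim.
        if spec.get("volumeClaimTemplates"):
            return True
    return False


# Hypothetical rendered output of a database Chart: a headless Service plus a StatefulSet.
rendered = [
    {"kind": "Service", "metadata": {"name": "db"}, "spec": {"clusterIP": "None"}},
    {
        "kind": "StatefulSet",
        "metadata": {"name": "db"},
        "spec": {
            "serviceName": "db",
            "replicas": 3,
            "volumeClaimTemplates": [
                {
                    "metadata": {"name": "data"},
                    "spec": {
                        "accessModes": ["ReadWriteOnce"],
                        "resources": {"requests": {"storage": "10Gi"}},
                    },
                }
            ],
        },
    },
]

# Hypothetical Chart that runs the database as a plain Deployment instead.
stateless = [{"kind": "Deployment", "metadata": {"name": "db"}, "spec": {"replicas": 3}}]

print(uses_statefulset_with_pvc(rendered))
print(uses_statefulset_with_pvc(stateless))
```

This check only looks at `volumeClaimTemplates`. A Chart that mounts a pre-existing claim through the pod's `volumes` list would need a separate check.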

Why StatefulSets?
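
Broadly, a StatefulSet gives each pod a stable, ordinal-based name, a stable DNS entry through a headless Service, and its own persistent volume claim that is reattached when the pod is rescheduled. A minimal sketch of that naming scheme follows, assuming hypothetical names (`db`, `data`, the `default` namespace) and the default `cluster.local` cluster domain:

```python
def stable_identities(name, service, namespace, replicas, claim):
    """List the pod name, DNS name, and claim name Kubernetes assigns to each replica.

    Illustrative helper only: it mirrors the StatefulSet naming convention
    and does not talk to a cluster.
    """
    identities = []
    for ordinal in range(replicas):
        pod = f"{name}-{ordinal}"  # pods are numbered from 0
        identities.append(
            {
                "pod": pod,
                # Assumes the default cluster domain, cluster.local.
                "dns": f"{pod}.{service}.{namespace}.svc.cluster.local",
                # Claims from volumeClaimTemplates are named <template>-<pod>.
                "pvc": f"{claim}-{pod}",
            }
        )
    return identities


ids = stable_identities("db", "db", "default", replicas=3, claim="data")
for identity in ids:
    print(identity["pod"], identity["dns"], identity["pvc"])
```

Because these names do not change across restarts, a replica that comes back as `db-1` finds the same volume and is reachable at the same address, which is what database clustering software generally expects.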

There you have it. If you really must put your database in a container, then use Kubernetes StatefulSets to help you get there. Most importantly, though, ask yourself why you’re doing it in the first place and whether you really need to.

Caylent offers DevOps-as-a-Service to high growth companies looking for help with microservices, containers, cloud infrastructure, and CI/CD deployments. Our managed and consulting services are a more cost-effective option than hiring in-house and we scale as your team and company grow. Check out some of the use cases and learn how we work with clients by visiting our DevOps-as-a-Service offering.