Running Apache Superset in Docker

A couple of days back, I wrote a post about running Apache Superset in a production environment to serve hundreds or thousands of users. Superset community members and users appreciated the post, for which I am thankful. However, on the Superset Slack and Gitter channels, many users asked questions about setting up Superset as a Docker container and how to run it. In this post, I explore the Docker image of Superset in more detail, and I hope that after reading it you will have a conceptual understanding of running Superset as a Docker container and the benefits of doing so.

Container? Image?

First, let’s quickly understand what exactly the terms "container" and "image" mean and how they relate to Docker.

As per Wikipedia, a container is any structure that holds a product for storage, packaging, and shipping. The same applies to containers in the software world.

A container is a standard unit of software that packages up the code and all its dependencies, so the application runs quickly and reliably from one computing environment to another.


Now, let’s look at what the term "image" means.

A container image is a lightweight, standalone, executable package of software that includes everything needed to run an application: code, run-time, system tools, system libraries, and settings.


Finally, the relationship with Docker.

Container images become containers at runtime and in the case of Docker containers — images become containers when they run on Docker Engine. Available for both Linux and Windows-based applications, containerized software will always run the same, regardless of the infrastructure.

There are many other container runtime environments, but Docker is the most popular among them.

Back to Superset Docker Image

There are multiple active repositories and images of Superset available on GitHub and Docker Hub. Below is a list of some of them.

Why so many repositories? Are they different? Aren’t they supposed to be the same and provide the same functionality, i.e., packaging Superset and its dependencies? Yes, they should be identical, but there are many different ways and modes of starting Superset. An image should be generic enough to handle all of those methods and commands, which is not the case, and that’s why there are multiple repositories.

I started working on Superset with the goal of running it in a completely distributed manner so that hundreds or thousands of users can access it concurrently. In the beginning, I explored the Apache Superset code but realized that several changes were required to run Superset across multiple containers in a distributed architecture, and that’s why I decided to maintain a separate repository.

Features of the Docker image of Superset

Starting the container with the docker-compose command starts three containers: MySQL 5.7 as the metadata database, Redis 3.4 as the cache and Celery broker, and the Superset container itself.

Starting the container with the docker run command can be used for a completely distributed setup; it requires the metadata database URL and the Redis URL to start the container.

How to Run

The commands below assume the following directory structure:

docker-superset
     |__config
     |    |__superset_config.py
     |
     |__docker-files
     |    |__docker-compose.yml
     |    |__.env   


Starting the Superset image as a Superset container in local mode:

cd docker-superset/docker-files/ && docker-compose up -d

Starting the Superset image as a Superset container in prod mode:

cd docker-superset/docker-files/ && SUPERSET_ENV=prod SUPERSET_VERSION=<version-tag> docker-compose up -d
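The two docker-compose invocations above differ only in the environment variables passed in. Here is a minimal sketch of a wrapper that prints the matching command for a given mode. The function name compose_command is hypothetical and not part of the repository, and I am assuming that "local" and "prod" are the only two modes, since those are the only ones used in this post.

```shell
#!/bin/sh
# Hypothetical helper: prints the docker-compose command for a given mode.
# Usage: compose_command [local|prod] [version-tag]
compose_command() {
    mode="${1:-local}"     # assumption: local mode when no mode is given
    version="${2:-}"       # image tag, only needed for prod mode

    case "$mode" in
        local)
            # Local mode needs no extra variables.
            echo "docker-compose up -d"
            ;;
        prod)
            # Prod mode passes the environment name and the image tag.
            if [ -z "$version" ]; then
                echo "prod mode needs a version tag" >&2
                return 1
            fi
            echo "SUPERSET_ENV=prod SUPERSET_VERSION=${version} docker-compose up -d"
            ;;
        *)
            echo "unknown mode: $mode" >&2
            return 1
            ;;
    esac
}

# Print the command for local mode.
compose_command local
```

Printing the command instead of running it makes it easy to review before executing it from the docker-superset/docker-files/ directory.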


Starting the Superset image as a server container:

cd docker-superset && docker run -p 8088:8088 -v "$(pwd)/config":/home/superset/config/ abhioncbr/docker-superset:<version-tag> cluster server <superset_metadata_db_url> <redis_url>

Starting the Superset image as a worker container:

cd docker-superset && docker run -p 5555:5555 -v "$(pwd)/config":/home/superset/config/ abhioncbr/docker-superset:<version-tag> cluster worker <superset_metadata_db_url> <redis_url>
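The metadata database URL and the Redis URL in the server and worker commands above are left as placeholders. Here is a small sketch of how they might be assembled, assuming the image accepts a SQLAlchemy-style MySQL URL and a standard redis:// URL. This post does not confirm the exact format, and every host, port, name, and credential below is a hypothetical value.

```shell
#!/bin/sh
# All values below are hypothetical placeholders; replace them with your own.
DB_USER="superset"
DB_PASSWORD="change-me"
DB_HOST="mysql.example.internal"
DB_PORT="3306"
DB_NAME="superset"
REDIS_HOST="redis.example.internal"
REDIS_PORT="6379"

# Assumed SQLAlchemy-style URL for the metadata database.
SUPERSET_METADATA_DB_URL="mysql://${DB_USER}:${DB_PASSWORD}@${DB_HOST}:${DB_PORT}/${DB_NAME}"

# Assumed redis:// URL; using database index 0 is also an assumption.
REDIS_URL="redis://${REDIS_HOST}:${REDIS_PORT}/0"

# Print both values so they can be checked before being passed to docker run.
echo "metadata db: ${SUPERSET_METADATA_DB_URL}"
echo "redis:       ${REDIS_URL}"
```

Both variables can then be passed to docker run in place of the two placeholders, so that the server and worker containers point at the same database and broker.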


Pulling the Superset image from Docker Hub:

docker pull abhioncbr/docker-superset:<version-tag>


Extending Superset Docker Image

Happy Superset Exploration!!!