Time Series Analysis: VAR-Model-As-A-Service Using Flask and MinIO

VAR-As-A-Service is an MLOps approach for unifying and reusing the deployment pipelines of statistical and machine learning models. This is the second in a series of articles built on top of that project, representing experiments with various statistical and machine learning models, data pipelines implemented using existing DAG tools, and storage services, both cloud-based and alternative on-premises solutions. This article focuses on model file storage using an approach that is equally applicable to machine learning models. The implemented storage is based on MinIO, an AWS S3-compatible object storage service. Furthermore, the article gives an overview of alternative storage solutions and outlines the benefits of object-based storage.

The first article of the series (Time Series Analysis: VARMAX-As-A-Service) compares statistical and machine learning models, both being mathematical models, and provides an end-to-end implementation of a VARMAX-based statistical model for macroeconomic forecasting using the Python library statsmodels. The model is deployed as a REST service using Python Flask and an Apache web server, packaged in a Docker container. The high-level architecture of the application is depicted in the following picture:

VARMAX-AS-A-Service

The model is serialized as a pickle file and deployed on the web server as part of the REST service package. However, in real projects, models are versioned, accompanied by metadata, and secured, and the training experiments need to be logged and kept reproducible. Furthermore, from an architectural perspective, storing the model in the file system next to the application contradicts the single responsibility principle. A good example is a microservice-based architecture: scaling the model service horizontally means that each microservice instance carries its own copy of the physical pickle file, replicated across all service instances. It also means that supporting multiple versions of the models requires a new release and redeployment of the REST service and its infrastructure. The goal of this article is to decouple models from the web service infrastructure and enable the reuse of the web service logic with different versions of models.

Before diving into the implementation, let's say a few words about statistical models and the VAR model used in this project. Statistical models are mathematical models, and so are machine learning models; more about the difference between the two can be found in the first article of the series. A statistical model is usually specified as a mathematical relationship between one or more random variables and other non-random variables. Vector autoregression (VAR) is a statistical model used to capture the relationship between multiple quantities as they change over time. VAR models generalize the single-variable autoregressive (AR) model by allowing for multivariate time series. In the presented project, the model is trained to forecast two variables. VAR models are often used in economics and the natural sciences. In general, the model is represented by a system of equations, which in the project are hidden behind the Python library statsmodels.
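To make this concrete, here is a minimal sketch of fitting a VAR model with statsmodels. The dataset (the macrodata sample shipped with statsmodels) and the chosen columns are illustrative only, not the project's actual training data:

import statsmodels.api as sm
from statsmodels.tsa.api import VAR

# Illustrative two-variable dataset: the macrodata sample shipped with statsmodels
data = sm.datasets.macrodata.load_pandas().data[["realgdp", "realcons"]]
data = data.diff().dropna()  # difference the series to make them stationary

model = VAR(data)
results = model.fit(maxlags=5, ic="aic")  # lag order selected by AIC

# Forecast the next 4 periods from the last observed lags
lag_order = results.k_ar
forecast = results.forecast(data.values[-lag_order:], steps=4)
print(forecast)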

The architecture of the VAR model service application is depicted in the following picture:

architecture of the VAR model service application

The VAR runtime component represents the actual model execution based on parameters sent by the user. It connects to a MinIO service via a REST interface, loads the model, and runs the prediction. Compared to the solution in the first article, where the VARMAX model is loaded and deserialized at application startup, the VAR model is read from the MinIO server each time a prediction is triggered. This comes at the cost of additional loading and deserialization time but also with the benefit of having the latest version of the deployed model at every single run. Furthermore, it enables dynamic versioning of models, making them automatically accessible to external systems and end-users, as will be shown later in the article. Note that due to that loading overhead, the performance of the selected storage service is of great importance.
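Conceptually, the per-request loading looks like the following sketch, which assumes the MinIO Python SDK, a server on localhost:9000, and example credentials; the bucket and object names follow the convention described later in the article:

import pickle

from minio import Minio

# Example endpoint and credentials; replace with the actual configuration
client = Minio(
    "localhost:9000",
    access_key="minioadmin",
    secret_key="minioadmin",
    secure=False,
)

def load_model(version: str = "0_0_1"):
    """Fetch and deserialize the pickled VAR model on every prediction."""
    response = client.get_object("models", f"var/{version}/model.pkl")
    try:
        return pickle.loads(response.read())
    finally:
        response.close()
        response.release_conn()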

But why MinIO and object-based storage in general?

MinIO is a high-performance object storage solution with native support for Kubernetes deployments that provides an Amazon Web Services S3-compatible API and supports all core S3 features. In the presented project, MinIO runs in standalone mode, with a single MinIO server and a single drive or storage volume, on Linux using Docker Compose. For extended development or production environments, there is the option of a distributed mode, described in the article Deploy MinIO in Distributed Mode.
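As a minimal sketch (not the project's actual setup code), connecting to such a standalone instance from Python and bootstrapping the bucket could look like this, assuming the default endpoint and example credentials:

from minio import Minio

# Example endpoint and credentials for a local standalone instance
client = Minio("localhost:9000", access_key="minioadmin",
               secret_key="minioadmin", secure=False)

# Create the bucket for the models once, if it does not exist yet
if not client.bucket_exists("models"):
    client.make_bucket("models")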

Let's have a quick look at some storage alternatives; a comprehensive description can be found here and here:

A summary of the main benefits of object storage:

It is time to look at the implementation of the VAR model as a service and the integration with MinIO. The deployment of the presented solution is simplified by using Docker and Docker Compose. The organization of the whole project looks as follows:

organization of the whole project

As in the first article, the preparation of the model comprises a few steps that are implemented in a Python script called var_model.py, located in a dedicated GitHub repository:

Those steps can also be implemented as tasks in a workflow engine (e.g., Apache Airflow) triggered by the need to train a new model version with more recent data. DAGs and their applications in MLOps will be the focus of another article.
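As a teaser of that, a hypothetical Airflow DAG wrapping those steps might look like the following sketch; the task names and monthly schedule are illustrative, assuming Airflow 2.x with the TaskFlow API:

import pendulum
from airflow.decorators import dag, task

@dag(schedule="@monthly", start_date=pendulum.datetime(2024, 1, 1), catchup=False)
def var_model_training():
    @task
    def fetch_data():
        ...  # pull the most recent macroeconomic data

    @task
    def train_model(data):
        ...  # fit the VAR model, as in var_model.py

    @task
    def upload_to_minio(model):
        ...  # store the pickled model under a new version

    upload_to_minio(train_model(fetch_data()))

var_model_training()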

The last step implemented in var_model.py is storing the model, serialized as a pickle file, in a bucket in S3. Due to the flat structure of the object storage, the selected format is:

<bucket_name>/<file_name>

However, file names may contain forward slashes to mimic a hierarchical structure, keeping the advantage of a fast linear search. The convention for storing VAR models is as follows:

models/var/0_0_1/model.pkl

where the bucket name is models and the file name is var/0_0_1/model.pkl. In the MinIO UI, it looks as follows:

models

This is a very convenient way of structuring various types of models and model versions while still having the performance and simplicity of flat file storage. 
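A minimal sketch of storing a model under this convention and discovering the available versions by prefix could look like this; the endpoint and credentials are example values, as in the earlier snippets:

import io
import pickle

from minio import Minio

client = Minio("localhost:9000", access_key="minioadmin",
               secret_key="minioadmin", secure=False)

def save_model(results, version: str) -> None:
    # Store the pickled model under the models/var/<version>/model.pkl convention
    payload = pickle.dumps(results)
    client.put_object("models", f"var/{version}/model.pkl",
                      io.BytesIO(payload), length=len(payload))

def list_model_versions():
    # The flat namespace still supports prefix queries over the pseudo-folders
    return [obj.object_name
            for obj in client.list_objects("models", prefix="var/", recursive=True)]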

Note that the model versioning is implemented as part of the model name. MinIO provides versioning of files, too, but the approach selected here has some benefits:

Once the model is deployed, it is time to expose it as a REST service using Flask and deploy it with docker-compose, running MinIO and an Apache web server. The Docker image, as well as the model code, can be found in a dedicated GitHub repository.
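A stripped-down sketch of such a Flask service is shown below; the route name and query parameter are illustrative and do not necessarily match the project's actual endpoint:

import pickle

from flask import Flask, jsonify, request
from minio import Minio

app = Flask(__name__)
client = Minio("localhost:9000", access_key="minioadmin",
               secret_key="minioadmin", secure=False)

def load_model(version="0_0_1"):
    # Fetch the pickled model from MinIO on every call
    response = client.get_object("models", f"var/{version}/model.pkl")
    try:
        return pickle.loads(response.read())
    finally:
        response.close()
        response.release_conn()

@app.route("/predict/var")
def predict():
    steps = int(request.args.get("steps", 4))
    results = load_model()
    forecast = results.forecast(results.endog[-results.k_ar:], steps=steps)
    return jsonify(forecast.tolist())

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)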

And finally, the steps needed to run the application are:

  1. Deploy the application: docker-compose up -d
  2. Execute the model preparation algorithm: python var_model.py (requires a running MinIO service)
  3. Check if the model has been deployed: http://127.0.0.1:9101/browser
  4. Test the model: http://127.0.0.1:80/apidocs

After deploying the project, the Swagger API is accessible via <host>:<port>/apidocs (e.g., 127.0.0.1:80/apidocs). There is one endpoint for the VAR model depicted next to the other two exposing a VARMAX model:

Swagger API

Internally, the service uses the deserialized model pickle file loaded from a MinIO service:

deserialized model pickle file

Requests are sent to the initialized model as follows:

Requests are sent to the initialized model
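For illustration, a client call could look like the following sketch; the endpoint path and parameter name are assumptions here, so check the Swagger UI for the actual contract:

import requests

# Hypothetical endpoint path and parameter; see the Swagger UI for the real ones
response = requests.get("http://127.0.0.1:80/predict/var", params={"steps": 4})
print(response.status_code, response.json())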

The presented project is a simplified VAR model workflow that can be extended step by step with additional functionalities like:

These future improvements will be the focus of upcoming articles and projects. The goal of this article was to integrate an S3-compatible storage API and enable the storage of versioned models. That functionality will soon be extracted into a separate library. The presented end-to-end infrastructural solution can be deployed to production and improved over time as part of a CI/CD process, also using the distributed deployment options of MinIO or replacing it with AWS S3.
