Model Serving Platform

The lifecycle of a Machine Learning model is traditionally managed in a very manual way. Sidra simplifies it by means of the Model Serving Platform. This platform uses Azure Databricks, MLflow and AzureML as services to perform typical tasks in a Machine Learning workflow, such as tracking performance measures, managing model versions, or deploying a model.

These services are selected in this platform for the following reasons:

  • Platform as a Service

    These services run entirely in the cloud as PaaS offerings, so no virtual (or physical) machine is needed. Removing the need for a dedicated machine reduces deployment complexity and maintenance effort, and improves scalability.
  • No Licensing Costs

    Unlike most third-party products, there is no licensing cost associated: costs are based on usage only. As part of the Azure infrastructure, it also benefits from any existing Enterprise Agreement between the client (whose solution will use Sidra) and Microsoft.

  • Integrated Workspace

    As part of the Sidra infrastructure, working on Databricks allows Data Scientists and Researchers to use Data Lake tables as datasets to develop new Machine Learning models. This is done by means of Databricks notebooks, which can also be shared across teams (see the sketch after this list).

  • Track experiments

    The use of MLflow allows teams to effectively track their experiments. Furthermore, the existence of an API facilitates its integration; consequently, experiment data can be retrieved by the Model Serving Platform.

  • Publish models as a service

    By means of AzureML, it is straightforward to deploy a model as a web service through its API or one of its SDKs, such as the Python one. This feature provides a strong benefit to the Model Serving Platform when the deployment of a model needs to be managed.
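To illustrate the integrated workspace, here is a minimal sketch of how a Data Lake table could be loaded as a dataset from a Databricks notebook. The database and table name (`sidra_dw.customers`) are hypothetical placeholders, not part of Sidra itself.

```python
# Minimal sketch: load a Data Lake table from a Databricks notebook.
# "sidra_dw.customers" is a hypothetical placeholder; `spark` is the
# SparkSession that Databricks predefines in every notebook.
df = spark.table("sidra_dw.customers")

# Take a bounded sample into pandas for exploratory work or model prototyping.
sample = df.limit(10000).toPandas()
```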

Machine Learning Workflow

A Machine Learning workflow is simply a sequence of tasks that needs to be performed in order to solve a specific problem by means of Machine Learning. There are several well-known standards for Machine Learning workflows; nonetheless, the most complete and clear view is the one shown in the figure below. The key element is the data scientist -or user, as depicted in the figure-, who is the person in charge of defining, creating and evaluating the different approaches to build a model that solves a specific problem. Alongside this, we can distinguish the following pieces in this workflow:

  • Data collection: this is the part where the data is gathered from different sources and stored in a particular format suited to the data scientists' needs.

  • Data processing: this is the part where the data is wrangled in order to adapt it to the model.

  • Model development: after the data processing step, the model is created and evaluated with some held-out samples. This step provides a performance measure that indicates how good the model is. Here, several models are tried and the depicted Train-Test loop comes into play, where the different models are tuned independently or even stacked to fulfill the needs of the application (a sketch of this loop follows below).

  • Model productionization: once a model has been chosen from the set of models trained during the development step, it has to be deployed in order to serve other users and applications. Even once the model is in production, it is continuously monitored to check whether its performance degrades and whether human intervention is needed.

Figure: Machine Learning workflow

Due to its nature, this is an iterative process, constantly changing and evolving.
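
As an illustration of the Train-Test loop in the model development step, the following sketch uses scikit-learn with one of its built-in datasets; the dataset and model choice are placeholders used only to show the workflow step, not part of the Sidra platform.

```python
# Illustrative Train-Test loop with scikit-learn. The dataset and the model
# are placeholder choices used only to demonstrate the workflow step.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)

# Hold out a test set so the performance measure is computed on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# This score is the performance measure used to compare candidate models;
# in the Train-Test loop we would tune hyperparameters and iterate.
print(f"Test accuracy: {model.score(X_test, y_test):.3f}")
```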

MLflow

In order to maintain a machine learning model, it is necessary to iterate over the aforementioned steps. Azure Databricks makes this easier thanks to its integration with MLflow. MLflow is an open source platform that helps us to experiment with, reproduce and deploy machine learning models.

Figure: MLflow UI

The introduction of MLflow was motivated by several problems:

  1. It's difficult to keep track of the experiments. Just a notebook is not enough to report and easily review several iterations of a model.

  2. It's difficult to reproduce the code, in the sense of having exactly the same environment that was used in the original project.

  3. It's difficult to promote models into production. Serving a model is always challenging for a Data Science team, in particular when several Machine Learning libraries are involved in the project.

MLflow consists of the following modules:

  • MLflow Tracking: it provides an API and UI for logging parameters, metrics, code and outputs during ML model development (a minimal tracking sketch follows this list).

  • MLflow Projects: it is a format for packaging data science code in a reusable and reproducible way. This is really helpful when the model is completely developed. Its latest feature allows the user to even run these MLflow Projects on a Kubernetes cluster, thus taking advantage of the speed and scalability of these clusters.

  • MLflow Models: it allows users to package the models in order to be used in a variety of downstream tools, such as real-time serving through a REST API or batch inference on Apache Spark.

  • MLflow Model Registry: it relies on a centralised model store, a set of APIs, and a UI to collaboratively manage the full lifecycle of an MLflow Model. It provides model lineage (which MLflow experiment and run produced the model), model versioning, stage transitions (for example from staging to production), and annotations.
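
As a concrete example of the Tracking module, here is a minimal sketch of logging one training run with MLflow; the parameter values, run name and model choice are illustrative only.

```python
# Minimal MLflow Tracking sketch: log parameters, a metric and the trained
# model for a single run. Parameter values and the run name are illustrative.
import mlflow
import mlflow.sklearn
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(
    *load_breast_cancer(return_X_y=True), test_size=0.2, random_state=42
)

with mlflow.start_run(run_name="rf-baseline"):
    params = {"n_estimators": 100, "max_depth": 5}
    model = RandomForestClassifier(**params).fit(X_train, y_train)

    mlflow.log_params(params)                                        # hyperparameters
    mlflow.log_metric("test_accuracy", model.score(X_test, y_test))  # performance measure
    mlflow.sklearn.log_model(model, artifact_path="model")           # serialized model
```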

MLOps

The philosophy of MLOps is to integrate the DevOps culture with good practices in Machine Learning project development in order to achieve continuous integration and continuous deployment of models. The idea is to combine DevOps techniques and MLflow to reduce the effort of promoting a model into a production environment.

Figure: CI/CD of a Machine Learning model

Using the Model Serving Platform, a user is able to deploy any Machine Learning model in just three steps (sketched after this list):

  • Register the model from an MLflow run.

  • Create a Docker image for the model using the Azure Machine Learning Workspace.

  • Deploy the model into a Kubernetes Cluster or Azure Container Instance, depending on the purpose.
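
The sketch below walks through these three steps using MLflow 1.x and the Azure ML SDK v1 (azureml-core) as they existed at the time of writing; all names (workspace, run id, model and service names) are hypothetical placeholders, and APIs may differ in newer versions.

```python
# Hedged sketch of the three deployment steps. Workspace details, the run id
# and all names are hypothetical placeholders. Assumes MLflow 1.x and the
# Azure ML SDK v1 (azureml-core).
import mlflow
import mlflow.azureml
from azureml.core import Workspace
from azureml.core.webservice import AciWebservice, Webservice

ws = Workspace.get(name="my-workspace",
                   subscription_id="<subscription-id>",
                   resource_group="<resource-group>")

# Step 1: register the model produced by an MLflow run.
model_uri = "runs:/<run-id>/model"
mlflow.register_model(model_uri, "my-model")

# Step 2: build a Docker image for the model in the Azure ML workspace.
image, azure_model = mlflow.azureml.build_image(model_uri=model_uri,
                                                workspace=ws,
                                                model_name="my-model",
                                                image_name="my-model-image",
                                                synchronous=True)

# Step 3: deploy the image; ACI is used here for dev/test, while an AKS
# cluster would be the target for production workloads.
aci_config = AciWebservice.deploy_configuration(cpu_cores=1, memory_gb=1)
service = Webservice.deploy_from_image(workspace=ws,
                                       name="my-model-service",
                                       image=image,
                                       deployment_config=aci_config)
service.wait_for_deployment(show_output=True)
```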

You can find more information about how to execute these steps in our platform in How to use Model Serving Platform.

Thanks to our platform and Azure DevOps, these steps are easy to integrate into CI/CD pipelines.