It’s sobering to learn that over 80% of machine learning (ML) models never make it to production because of complex model deployment procedures. Even among companies whose ML models have successfully made it to production, many report having convoluted Machine Learning Operations (MLOps) pipelines powering their ML systems.
It can be inefficient for companies to have a separate team to handle their ML products in deployment. For small and medium-scale businesses, this also translates to a hefty additional cost in hiring and infrastructure management.
This article explains why you do not need to maintain two separate pipelines, DevOps and MLOps. You will also learn about KitOps, a new open source tool that unifies these pipelines, saving cost and effort.
Let's start by discussing why you do not want separate operations pipelines within your company.
Why is it undesirable to keep separate pipelines for DevOps and MLOps?
DevOps and MLOps share the same goal: both are concerned with keeping software up to date in the production environment. However, because DevOps predates MLOps, it was not built to accommodate the iterative process of ML model development. Hence, the two fields diverged.
Regardless of how and why they diverged, it is usually the team, and the organization behind it, that pays the price. Here are a few drawbacks of having two separate operations teams:
Siloed knowledge
When two operations teams work independently, knowledge and expertise may become siloed, resulting in inefficient development. Additionally, vulnerabilities may be introduced when one team is unaware of the implementation details of the other team's methodologies.
Repetition of efforts and work duplication
This is a side effect of siloed knowledge. When there is a communication gap between the teams, they may independently work on the same problem. Such unnecessary repetition adds friction and slows development.
Slower development
In software engineering, each additional team on a project introduces communication overhead, which can slow down the pace of development.
Increased cost
Hiring for and maintaining two teams instead of one obviously increases the budget requirements for the organization.
Why consider merging MLOps and DevOps into a single pipeline?
Having learned the consequences of maintaining two separate operations pipelines, wouldn't using the same pipeline for conventional and ML-powered software applications be better? Indeed, it would. And if there were a tool to help you unify both pipelines, what features would it need? It would require three key features:
- Wide compatibility and easy integration
- Compliance
- Support for non-proprietary standards and large artifacts
Wide compatibility and easy integration
The tool should support the libraries and frameworks used by both software engineers and data scientists, integrate with them easily, and have a shallow learning curve so that engineers can adopt it within their current workflows.
Some tools and services commonly used by software developers, data engineers, data scientists, machine learning engineers, DevOps engineers, and MLOps engineers include:
- Version control and CI/CD: Git, GitHub, Bitbucket, and CircleCI for version control, automated testing, and continuous integration or continuous delivery (CI/CD).
- Data workflow orchestration: Airflow for orchestrating complex data workflows and managing data pipelines.
- Model registry and deployment: MLflow and Neptune.ai for model registry, model packaging, and deploying machine learning models.
- Configuration management: Hydra for managing configurations in machine learning experiments and deployments.
- Machine learning frameworks: PyTorch and TensorFlow for building, training, and evaluating machine learning models, running experiments, and making predictions.
- Data and model versioning: Data Version Control (DVC) for data versioning, model versioning, and integrating CI/CD into machine learning workflows.
- Cloud services: AWS, Azure, and GCP for deploying machine learning models in scalable environments.
- Workflow orchestration: Ray for orchestrating distributed machine learning workflows.
Compliance
The European Union (EU) regularly updates its regulations to make companies employing AI for decision-making more accountable. Along the same lines, it recently introduced a regulation requiring companies to be able to link a model with the exact dataset it was trained on for up to 10 years after the model completed its training.
Most tools in the ecosystem do not accommodate such stringent requirements.
Support for non-proprietary standards and large artifacts
The key difference between machine learning projects and other software engineering projects is that ML projects often involve and produce large artifacts (data and models). Tools like Git cannot be used to manage (track and share) them. There are external tools that support such large artifacts, but they store them using proprietary standards, making them less portable. As such, a tool that supports non-proprietary standards and accommodates large artifacts (with data versioning and tracking) can very likely be used across both MLOps and DevOps teams.
Does a tool with all these features exist? KitOps provides all three with only a minimal learning curve. It is flexible and can easily be used with your current workflow. Let's look at how it can help unify the two pipelines.
How KitOps unifies your DevOps and MLOps pipeline
KitOps is an open source tool that helps you securely package and version code, data, and even large artifacts. The fascinating thing about KitOps is that it relies on open source tools and standards and stores its assets in non-proprietary formats (OCI-compatible artifacts). This makes it easy to use KitOps in combination with other open source tools, including any modern CI/CD pipeline.
Additionally, KitOps generates a checksum for each artifact (code, data, model, etc.) in a version, so it can easily track changes between versions and link the artifacts within a version to one another, ensuring traceability and tamper-proof packaging.
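For example, you can surface these checksums from the command line. The following is a minimal sketch, assuming your version of the Kit CLI provides the kit inspect command; the registry URL and tag are hypothetical:
# Print the ModelKit's manifest, which includes a digest for each packaged artifact
# (the registry URL and tag below are hypothetical)
kit inspect ghcr.io/acme/python-example:v1.0.0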
With KitOps, your development pipeline, whether for an ML project or a conventional software engineering project, can be summarized in three steps:
- Local development and experimentation
- Testing
- Deployment and monitoring
Local development and experimentation
In this phase, a developer writes code to add new features or improve existing ones. For conventional software development, this could mean writing patches, fixing bugs, and so on. For machine learning projects, it additionally includes model training and fine-tuning with new data, as well as code for data preparation, feature engineering, data management, experiment tracking, and the model training pipeline.
Once the application works as expected, the developer can write a Kitfile, analogous to a Dockerfile, and register the artifacts: code, data, models, static files, etc. The developer then packages the artifacts into a ModelKit using kit pack and pushes it to a remote registry using kit push.
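As a quick sketch, verifying the installation and logging in to a registry with the Kit CLI might look like this; the registry and username are hypothetical, and you should check kit login --help for the exact flags your version supports:
# Confirm the Kit CLI is installed
kit version
# Authenticate against your container registry (registry and username are hypothetical)
kit login ghcr.io -u <USERNAME>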
For example, once you have installed Kit, logged in to a remote registry, and completed the necessary changes to your code, you can write a Kitfile similar to this:
manifestVersion: v1.0.0
package:
  name: Python Example
  description: 'A simple python app'
  license: Apache-2.0
  authors: [Jozu]
code:
  - path: .
    description: Sample app code
You can register anything, including code, models, and datasets, in your Kitfile:
manifestVersion: v1.0.0
package:
  name: Python Example
  description: 'A simple python app'
  license: Apache-2.0
  authors: [Jozu]
code:
  - path: .
    description: Sample app code
model:
  # <Model metadata and location here>
datasets:
  # <Dataset metadata and location here>
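As an illustration only, the model and datasets sections might be filled in as follows. The names and paths are hypothetical, and you should check the Kitfile reference for the exact fields your Kit version supports:
model:
  name: sample-model
  path: ./models/model.onnx
  description: Hypothetical trained model
datasets:
  - name: training-data
    path: ./data/train.csv
    description: Hypothetical training dataset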
Then, you can package the content into a ModelKit and push it to a remote registry:
kit pack . -t YOUR_CONTAINER_REGISTRY_URL/APP_NAME:TAG
kit push YOUR_CONTAINER_REGISTRY_URL/APP_NAME:TAG
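For instance, with a hypothetical GitHub Container Registry repository, these commands could look like:
kit pack . -t ghcr.io/acme/python-example:v1.0.0
kit push ghcr.io/acme/python-example:v1.0.0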
Testing
At this stage, you pull and test the latest changes. Since ModelKits are compatible with most tools, developers can use their favorite open source tools and cloud platforms for testing. For instance, you could write a GitHub Actions pipeline that pulls the relevant artifacts from the ModelKit using kit pull and executes tests, as sketched after the commands below.
A minimal setup unpacks the relevant artifacts (for now, just the code) and tests them with your preferred tool (pytest in this case).
kit unpack YOUR_CONTAINER_REGISTRY_URL/APP_NAME:TAG --code
pytest
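Following the GitHub Actions idea mentioned above, a minimal workflow might look like the sketch below. The installation step, registry URL, tag, and login flags are hypothetical, so adapt them to your setup and consult kit login --help for the exact options:
name: test
on: [push]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - name: Install the Kit CLI (hypothetical step; follow the KitOps installation guide)
        run: curl -fsSL https://example.com/install-kit.sh | sh
      - name: Log in to the registry (hypothetical registry and credentials)
        run: echo "${{ secrets.GITHUB_TOKEN }}" | kit login ghcr.io -u ${{ github.actor }} --password-stdin
      - name: Unpack the code from the ModelKit (hypothetical registry URL and tag)
        run: kit unpack ghcr.io/acme/python-example:v1.0.0 --code
      - name: Run the tests
        run: |
          pip install pytest
          pytest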
Once the tests pass, you can build a container image, package it with the code, and push it to the remote registry.
kit pack . -t YOUR_CONTAINER_REGISTRY_URL/APP_NAME:TAG
kit push YOUR_CONTAINER_REGISTRY_URL/APP_NAME:TAG
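If your pipeline builds the container image at this stage, the build step might look like this minimal sketch; the image name and tag are hypothetical:
# Build and push the application image (image name and tag are hypothetical)
docker build -t ghcr.io/acme/python-example:v1.0.0 .
docker push ghcr.io/acme/python-example:v1.0.0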
Deployment and monitoring
Once the tests are successful, you can pull the necessary artifacts (commonly a Docker image) to the production server and deploy the updates. You can also build a monitoring dashboard with your favorite tools (Grafana, Streamlit, etc.) to track application metrics, or employ model monitoring for deployed machine learning models.
Use the following commands to unpack the code from your remote registry to your deployment server and deploy it.
kit unpack YOUR_CONTAINER_REGISTRY_URL/APP_NAME:TAG --code
docker run <IMAGE_NAME>:<TAG>
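Putting it together, a minimal deployment script on the server might look like the following sketch; the registry URL, image name, and port are hypothetical, so adjust them to your environment:
# Pull the latest code from the ModelKit registry (hypothetical registry URL and tag)
kit unpack ghcr.io/acme/python-example:v1.0.0 --code
# Build the application image from the unpacked code and run it (hypothetical image name and port)
docker build -t python-example:v1.0.0 .
docker run -d --restart unless-stopped -p 8080:8080 python-example:v1.0.0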
Wrapping up
The pipeline could include more steps, depending on the consensus among team members and the organization's practices. However, the bottom line is that with KitOps and ModelKits you can keep a single deployment pipeline for both ML and conventional software engineering projects. To learn more about KitOps, visit the project site, which also has an easy-to-follow guide to get you started.