Introduction to Why and How to apply Machine learning operations
22 February 2021
What are Machine learning operations?
Machine learning operations (MLOps) is the application of DevOps (development/operations) practices to building and deploying ML models. MLOps defines processes that make ML development more reliable and productive by adding discipline to development and deployment.
Machine learning systems can be considered a special type of software system. The real challenge is not building an ML model, but building an integrated ML system and operating it continuously in production.
Data scientists frequently face these challenges when trying to put their models into operation:
- Keeping track of the different versions of the code
- Keeping track of implemented and failed models and ideas
- Reproducibility – Not being able to rerun the best model with a more thorough parameter sweep
- Traceability – The model needs to be updated on a regular basis as new data comes in, so it must stay clear which data produced which version of the model
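The tracking and reproducibility problems listed above can be eased by recording, for every run, the code version, a hash of the data, and the parameters. The sketch below is only an illustration under stated assumptions: `fingerprint`, `run_experiment`, the `"abc123"` version string, and the toy score formula are all hypothetical stand-ins, not part of any particular MLOps tool.

```python
import hashlib
import json
import random


def fingerprint(obj):
    """Return a short, stable hash of any JSON-serializable object."""
    payload = json.dumps(obj, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:12]


def run_experiment(params, data, code_version):
    """Run a toy 'training' job and return a record describing the run."""
    rng = random.Random(params["seed"])  # fixed seed makes the run repeatable
    noise = rng.uniform(-0.01, 0.01)
    # Hypothetical score: stands in for a real validation metric.
    score = round(sum(data) / len(data) * params["learning_rate"] + noise, 6)
    return {
        "code_version": code_version,    # which code produced the result
        "data_hash": fingerprint(data),  # which data produced the result
        "params": params,                # which settings produced the result
        "score": score,
    }


# Log every attempt, including the ones that turn out worse.
experiment_log = []
data = [1.0, 2.0, 3.0, 4.0]
for lr in (0.1, 0.5):
    params = {"learning_rate": lr, "seed": 42}
    experiment_log.append(run_experiment(params, data, code_version="abc123"))

# Because everything needed is in the record, the best run can be repeated.
best = max(experiment_log, key=lambda record: record["score"])
rerun = run_experiment(best["params"], data, best["code_version"])
```

Since the seed, parameters, and data are all stored in the record, rerunning the best entry gives back the same record, which is the property a wider parameter sweep would build on.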
How do we reduce the time between analyzing the problem, creating the models, and deploying the solution, while maintaining the quality of the output? The approach that addresses this is called DevOps for ML, or simply MLOps.
Machine learning lifecycle
Continuous Integration (CI):
The developers on a team do not all work on the same copy of the code at the same time. Each developer checks that their changes work for their use case, runs the unit tests against the main version of the code, and then merges the changes back. Doing this frequently against the main version of the code is called Continuous Integration, or CI.
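As a rough illustration of the unit tests that would run before each merge, here is a minimal sketch using Python's standard `unittest` module. The function `scale_to_unit_range` and the helper `run_checks` are hypothetical examples invented for this post; a real CI server would run the project's own test suite.

```python
import unittest


def scale_to_unit_range(values):
    """Scale numbers linearly so the smallest becomes 0.0 and the largest 1.0."""
    low, high = min(values), max(values)
    if high == low:
        # All values are equal, so there is no range to scale over.
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


class ScaleToUnitRangeTest(unittest.TestCase):
    def test_endpoints(self):
        self.assertEqual(scale_to_unit_range([2, 4, 6]), [0.0, 0.5, 1.0])

    def test_constant_input(self):
        self.assertEqual(scale_to_unit_range([3, 3]), [0.0, 0.0])


def run_checks():
    """Run the suite the way a CI job might, returning True on success."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ScaleToUnitRangeTest)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()
```

A CI job would typically block the merge whenever `run_checks()` returns `False`.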
Continuous Delivery (CD):
This is a method for building, testing, and releasing software in short cycles. This way, the main development code is ideally always production ready and can be released into the live environment at any time. Continuous delivery can be manual or automatic.
Continuous Training (CT):
In machine learning operations, continuous integration of source code, unit testing, integration testing, and continuous delivery of the software to production are all very important. But data is one of the most important aspects of MLOps. So in addition to continuous integration and continuous delivery, we need Continuous Training as new data arrives. Basically, we need to keep training our models on real-time data. This can be done by regularly monitoring and retraining the models.
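The monitor-then-retrain idea can be sketched in a few lines. This is a toy example under loud assumptions: the "model" is just a single threshold, and the names `train_threshold_model`, `monitor_and_retrain`, and the `min_accuracy` cutoff of 0.9 are hypothetical choices made for illustration.

```python
def train_threshold_model(features, labels):
    """Toy 'training': place the threshold midway between the two class means."""
    positives = [x for x, y in zip(features, labels) if y == 1]
    negatives = [x for x, y in zip(features, labels) if y == 0]
    mean_pos = sum(positives) / len(positives)
    mean_neg = sum(negatives) / len(negatives)
    return (mean_pos + mean_neg) / 2


def predict(threshold, features):
    """Predict class 1 for every value above the threshold."""
    return [1 if x > threshold else 0 for x in features]


def accuracy(predictions, labels):
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)


def monitor_and_retrain(threshold, features, labels, min_accuracy=0.9):
    """Keep the model if it still performs well on new data, else retrain."""
    current = accuracy(predict(threshold, features), labels)
    if current >= min_accuracy:
        return threshold, False
    return train_threshold_model(features, labels), True


# Initial training data.
model = train_threshold_model([1, 2, 8, 9], [0, 0, 1, 1])

# New data arrives with a shifted distribution, so the old model degrades.
new_features, new_labels = [11, 12, 18, 19], [0, 0, 1, 1]
model, retrained = monitor_and_retrain(model, new_features, new_labels)
```

In practice the check would run on a schedule or whenever a new batch of labeled data lands, and the retrained model would go through the same CI and CD steps as any other change.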
Phases of a Machine learning project
Designing a machine learning model can be divided into three main phases:
- Discovery phase: Identifying the business requirements and the use case, and discovering the type of data you will be handling, is crucial when building the model.
- Development phase: Defining the algorithm is the main task, and data steps such as cleaning, extracting, analyzing, and transforming are implemented while creating the data pipeline. Once the data is ready, building and evaluating the model begins, followed by a couple of iterations until you are happy with the results and ready to present them to clients or stakeholders.
- Deployment phase: Deployment requires hosting the trained model, serving it, and having a prediction service ready to handle requests. Finally, monitoring continuously evaluates the model and triggers retraining based on the current results.
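To make the development and deployment phases more concrete, here is a minimal sketch of a pipeline that cleans data, fits and evaluates a model, and wraps it in a prediction service. Everything here is an assumption made for illustration: the record layout with `"x"` and `"y"` keys, the straight-line model, and the `PredictionService` class are hypothetical, and a real service would sit behind a web framework.

```python
def clean(rows):
    """Data step: drop records with missing values."""
    return [r for r in rows if r["x"] is not None and r["y"] is not None]


def fit_line(rows):
    """Model building: least-squares fit of y = slope * x + intercept."""
    n = len(rows)
    mean_x = sum(r["x"] for r in rows) / n
    mean_y = sum(r["y"] for r in rows) / n
    cov = sum((r["x"] - mean_x) * (r["y"] - mean_y) for r in rows)
    var = sum((r["x"] - mean_x) ** 2 for r in rows)
    slope = cov / var
    return {"slope": slope, "intercept": mean_y - slope * mean_x}


def mean_absolute_error(model, rows):
    """Evaluation: average absolute gap between prediction and target."""
    errors = [abs(model["slope"] * r["x"] + model["intercept"] - r["y"]) for r in rows]
    return sum(errors) / len(errors)


class PredictionService:
    """Deployment: holds the trained model and answers prediction requests."""

    def __init__(self, model):
        self.model = model

    def handle(self, request):
        x = request["x"]
        return {"prediction": self.model["slope"] * x + self.model["intercept"]}


raw_rows = [
    {"x": 1, "y": 3},
    {"x": 2, "y": 5},
    {"x": None, "y": 4},  # incomplete record, removed by clean()
    {"x": 3, "y": 7},
]
rows = clean(raw_rows)
model = fit_line(rows)
error = mean_absolute_error(model, rows)
service = PredictionService(model)
response = service.handle({"x": 4})
```

The evaluation step is where the iterations mentioned above would happen: if `error` is too high, you would go back to the data steps or the algorithm before handing the model to the service.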
I hope you find this information useful as you begin your journey into MLOps. All the best!