What Is the Machine Learning Life Cycle in 2021?
These days, it seems everyone is learning machine learning (ML). Nearly every company that collects data is trying to figure out some way to use AI and machine learning to analyze its business and provide automated solutions.
By 2027, the market value of machine learning is expected to reach 117 billion U.S. dollars—Fortune Magazine
The surge of interest in machine learning has brought in many newcomers who have no formal background in the field. It is great that more people are getting excited about this field and learning it, but it has become clear that integrating machine learning projects into production environments is not an easy task.
Image from the 2020 State of Enterprise Machine learning by Algorithmia based on 750 businesses
55% of companies have not put their Machine learning models into production yet — Algorithmia
Many people seem to assume that a machine learning project is simple as long as you have the data and computing resources needed to train a model. They could not be more wrong. This assumption tends to cost significant time and money without a model ever being deployed.
In this article, we will discuss what the life cycle of an ML project looks like, and some tools that can help solve the problem.
Machine learning life cycle
Machine learning projects are not simple or straightforward. They are a cycle of improving the data, the model, and the evaluation, and they are never really finished. This cycle is crucial for developing machine learning models because it focuses on using model results and evaluations to refine the data set. A high-quality data set is the most reliable way to train a high-quality model. How quickly you can iterate through this cycle determines your cost. Fortunately, some tools can help speed up the cycle without sacrificing quality.
As with any system, even deployed machine learning models need to be monitored, maintained, and updated. You can't just deploy an ML model, forget about it, and expect it to keep working in the real world as well as it did on the test set. As you uncover biases in the model, add new data sources, or need additional functionality, the ML model deployed in production will need to be updated. This brings you right back to the data, model, and evaluation cycle.
As of 2021, deep learning has been prominent for more than a decade and has helped bring machine learning to the center of the market. The machine learning industry is booming, and countless products have been developed to assist in the creation of machine learning models. Every step of the ML life cycle has tools that you can use to speed up the process, so that you do not end up as one of the companies whose ML projects never reach production.
The next section will delve into each stage of the machine learning life cycle and focus on popular tools.
Phase 1: Data
Although the ultimate goal is a high-quality model, the lifeblood of training a good model is the amount and, more importantly, the quality of the data you feed it.
The main data-related steps in the Machine learning life cycle are:
Data collection - Collect as much raw data as possible, regardless of quality. In the end, only a small part of it will be annotated anyway, and annotation is where most of the cost comes from. Having plenty of extra data on hand is useful, because you can add it as needed when model performance becomes a problem.
- List of public data sets
Define the annotation schema - This is one of the most important parts of the data phase of the life cycle, and it is often overlooked. A poorly constructed annotation schema leads to ambiguous classes and edge cases, which makes training a model much more difficult.
For example, the performance of an object detection model largely depends on properties such as size, position, orientation, and truncation. Therefore, including attributes such as object size, density, and occlusion during annotation can provide the key metadata needed to create a high-quality training data set that the model can learn from.
- Matplotlib, Plotly — plot the attributes of the data
- Tableau-an analytics platform to better understand your data
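As a rough illustration of recording attributes such as object size during annotation, the sketch below derives a coarse size bucket from each bounding box and tallies how the attributes are distributed. The box format (relative `[x, y, width, height]`), the thresholds, and all names here are assumptions made for this example; they do not come from any particular annotation tool.

```python
from collections import Counter

# Hypothetical annotations: boxes are [x, y, width, height] in relative
# coordinates (0 to 1). This format is an assumption for illustration.
annotations = [
    {"label": "car", "bbox": [0.10, 0.20, 0.50, 0.40], "occluded": False},
    {"label": "wheel", "bbox": [0.15, 0.50, 0.10, 0.10], "occluded": True},
    {"label": "person", "bbox": [0.70, 0.30, 0.20, 0.60], "occluded": False},
    {"label": "wheel", "bbox": [0.45, 0.50, 0.12, 0.12], "occluded": False},
]


def relative_area(bbox):
    """Fraction of the image covered by the box."""
    _, _, width, height = bbox
    return width * height


def size_bucket(bbox, small=0.05, large=0.15):
    """Assign a coarse size attribute; the thresholds are arbitrary examples."""
    area = relative_area(bbox)
    if area < small:
        return "small"
    if area < large:
        return "medium"
    return "large"


# Store the derived attribute alongside the manually annotated ones.
for ann in annotations:
    ann["size"] = size_bucket(ann["bbox"])

# Tally the attributes to spot imbalances before training.
size_counts = Counter(ann["size"] for ann in annotations)
occluded_counts = Counter(ann["occluded"] for ann in annotations)
print(size_counts)
print(occluded_counts)
```

Counts like these can be passed straight to a plotting library such as Matplotlib to see, for instance, whether small or occluded objects are underrepresented.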
Data annotation - Annotation is a tedious process of performing the same task over and over for hours, which is why annotation services are a booming business. As a result, annotators are likely to make many mistakes. Although most annotation companies guarantee a maximum error rate (for example, 2%), the bigger problem is a poorly defined annotation schema that leads annotators to label samples inconsistently. This is difficult for the annotation company's quality assurance team to catch, so it is something you need to check yourself.
- Scale, Labelbox, Prodigy-popular annotation services
- Mechanical Turk-Crowdsourcing Annotation Platform
- CVAT — DIY computer vision annotation
- Doccano — NLP specific annotation tool
- Centaur Labs-medical data labeling service
Improve data sets and annotations - You will probably spend most of your time here when trying to improve model performance. If your model is learning but performing poorly, the culprit is almost always a training data set containing biases and errors that put a ceiling on the model's performance. Improving the model usually involves solutions such as hard sample mining (adding new training data similar to the samples on which the model fails), rebalancing the data set based on the biases the model has learned, and updating the annotations and schema to add new labels and refine existing ones…
- DAGsHub —Data set version control
- FiftyOne — visualize data sets and find errors
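The first step of hard sample mining, finding the samples on which the model fails, can be sketched in a few lines. The result records, the field names, and the confidence threshold below are all hypothetical; a real project would read them from its own evaluation output. The sketch only identifies hard samples and the classes they belong to; collecting similar new data is a separate step.

```python
from collections import Counter

# Hypothetical per-sample results from an evaluation run.
results = [
    {"id": "img_001", "label": "cat", "predicted": "cat", "confidence": 0.97},
    {"id": "img_002", "label": "dog", "predicted": "cat", "confidence": 0.81},
    {"id": "img_003", "label": "dog", "predicted": "dog", "confidence": 0.55},
    {"id": "img_004", "label": "bird", "predicted": "dog", "confidence": 0.62},
    {"id": "img_005", "label": "cat", "predicted": "cat", "confidence": 0.91},
]


def find_hard_samples(results, confidence_threshold=0.6):
    """Return samples that were misclassified or predicted with low confidence."""
    return [
        r for r in results
        if r["predicted"] != r["label"] or r["confidence"] < confidence_threshold
    ]


hard_samples = find_hard_samples(results)
hard_ids = [r["id"] for r in hard_samples]

# Count hard samples per ground-truth class to see which classes need more data.
hard_per_class = Counter(r["label"] for r in hard_samples)

print(hard_ids)
print(hard_per_class.most_common())
```

The per-class counts give a simple signal for where to collect or annotate more data, or which classes to rebalance.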
Phase 2: Model
Even though the output of this process is a model, ideally you will spend the least amount of your time in this phase.
In industry, more time is spent on datasets than models. Credit to Andrej Karpathy
Explore existing pre-trained models - The goal here is to reuse as many available resources as possible to give yourself the best head start on building a model. Today, transfer learning is a core tenet of deep learning. You will most likely not create a model from scratch, but rather fine-tune an existing model that was pre-trained on a related task. For example, if you want to create a mask detection model, you might download a pre-trained face detection model from GitHub, because face detection is a more popular topic with more prior work behind it.
- FiftyOne Model Zoo—You can download and run the model with just one line of code
- TensorFlow Hub - a repository of trained machine learning models
- modelzoo.co-Pre-trained deep learning models for various tasks and libraries
Build a training loop - Your data will probably differ from the data used to pre-train the model. For image datasets, things like input resolution and object size need to be considered when setting up the training pipeline for the model. You also need to modify the output structure of the model to match the classes and structure of your labels. PyTorch Lightning provides a simple way to scale up model training with little code.
- scikit-learn - build and visualize classic ML systems
- PyTorch, PyTorch Lightning, TensorFlow, TRAX - popular deep learning Python libraries
- SageMaker - build and train ML systems in the SageMaker IDE
Experiment tracking-this entire cycle may require multiple iterations. You will end up training many different models, so being meticulous when tracking different versions of the model and the hyperparameters and data used to train it will greatly help keep things organized.
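A minimal sketch of this kind of bookkeeping, using only the Python standard library, is shown below. The file layout (one JSON record per line), the field names, and the helper functions are assumptions made for illustration; dedicated experiment tracking tools offer far more than this.

```python
import hashlib
import json
import tempfile
import time
from pathlib import Path


def log_run(log_path, model_name, hyperparameters, dataset_version, metrics):
    """Append one training run as a JSON line and return its record."""
    record = {
        "timestamp": time.time(),
        "model_name": model_name,
        "hyperparameters": hyperparameters,
        "dataset_version": dataset_version,
        "metrics": metrics,
    }
    # A short hash of the configuration makes identical setups easy to spot.
    config = json.dumps(
        [model_name, hyperparameters, dataset_version], sort_keys=True
    )
    record["config_id"] = hashlib.sha256(config.encode()).hexdigest()[:8]
    with open(log_path, "a") as f:
        f.write(json.dumps(record) + "\n")
    return record


def best_run(log_path, metric):
    """Return the logged run with the highest value of the given metric."""
    with open(log_path) as f:
        runs = [json.loads(line) for line in f if line.strip()]
    return max(runs, key=lambda run: run["metrics"][metric])


# Example usage with made-up runs, written to a temporary directory.
log_path = Path(tempfile.mkdtemp()) / "runs.jsonl"
log_run(log_path, "resnet50", {"lr": 0.01, "epochs": 10}, "v1", {"val_accuracy": 0.84})
log_run(log_path, "resnet50", {"lr": 0.001, "epochs": 20}, "v2", {"val_accuracy": 0.89})

best = best_run(log_path, "val_accuracy")
print(best["hyperparameters"], best["dataset_version"])
```

Recording the data set version next to the hyperparameters matters here, because in this life cycle the data changes at least as often as the model.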
Phase 3: Evaluation
Once you have a model that has learned the training data, it is time to dig in and see how well it performs on new data.
The key steps in evaluating a machine learning model are:
Visualize model output - As soon as you have a trained model, run it on a few samples and look at the output. This is the best way to find out whether there are any bugs in your training or evaluation pipeline before you evaluate on the entire test set. It will also reveal any glaring errors, such as two of your classes having been mislabeled.
- OpenCV, Numpy, Matplotlib-write custom visualization scripts
- FiftyOne — visualize the output of computer vision tasks on images and videos
Choose the right metrics - Coming up with one or a few metrics helps you compare the overall performance of models. To make sure you choose the best model for your task, develop metrics that reflect your end goal. You should also update the metrics as you find other important qualities to track. For example, if you want to start tracking how well your object detection model performs on small objects, use mAP on objects with a bounding box < 0.05 as one of your metrics.
Although these overall data set metrics can be used to compare the performance of multiple models, they rarely help to understand how to improve the performance of a model.
- scikit-learn - provides common metrics
- Python, NumPy - develop custom metrics
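A custom metric restricted to small objects might look like the sketch below. To keep it short, it computes recall rather than mAP, and it assumes that 0.05 refers to the fraction of the image area covered by the box, which the text above does not specify. The data, the `detected` flag (assumed to come from an earlier matching step), and the function names are hypothetical.

```python
# Hypothetical ground-truth objects: boxes are [x, y, width, height] in
# relative coordinates, and "detected" records whether the model found them.
ground_truth = [
    {"bbox": [0.10, 0.10, 0.50, 0.50], "detected": True},
    {"bbox": [0.60, 0.20, 0.30, 0.40], "detected": True},
    {"bbox": [0.05, 0.80, 0.10, 0.10], "detected": True},
    {"bbox": [0.30, 0.70, 0.20, 0.20], "detected": False},
    {"bbox": [0.90, 0.90, 0.05, 0.05], "detected": False},
]


def relative_area(bbox):
    """Fraction of the image covered by the box."""
    _, _, width, height = bbox
    return width * height


def recall(objects, max_area=None):
    """Recall over ground-truth objects, optionally only those below max_area.

    Returns None when no objects remain after filtering.
    """
    if max_area is not None:
        objects = [o for o in objects if relative_area(o["bbox"]) < max_area]
    if not objects:
        return None
    detected = sum(1 for o in objects if o["detected"])
    return detected / len(objects)


overall_recall = recall(ground_truth)
small_object_recall = recall(ground_truth, max_area=0.05)

print(f"overall recall: {overall_recall:.2f}")
print(f"small-object recall: {small_object_recall:.2f}")
```

In this made-up data, the overall number hides the fact that the model misses most small objects, which is exactly the kind of quality a targeted metric is meant to expose.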
View failure cases - Everything your model does is based on the data it was trained on. So, assuming it is able to learn something, if it performs worse than you expected, you need to look at the data. It can be useful to look at cases where the model is doing well, but it is critical to look at the false positives and false negatives where the model predicted incorrectly. After looking through enough of these samples, you will begin to see failure modes in the model.
For example, the image below shows a sample from the Open Images dataset, with a false positive shown on the rear wheel. It turns out that this false positive is actually a missing annotation. Verifying all wheel annotations in the dataset and fixing other similar errors can help improve the performance of the model on wheels.
> Image credit to Tyler Ganter (source)
- FiftyOne, Aquarium, Scale Nucleus—Debug data sets to find errors
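For object detection, pulling out false positives and false negatives usually means matching predictions to ground truth by overlap. The sketch below uses a simple greedy match on intersection over union (IoU); the box format (`[x1, y1, x2, y2]` in pixels), the 0.5 threshold, and the sample data are assumptions for illustration, and evaluation libraries typically implement more careful matching.

```python
def iou(box_a, box_b):
    """Intersection over union of two [x1, y1, x2, y2] boxes."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    intersection = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def find_failures(ground_truth, predictions, iou_threshold=0.5):
    """Greedily match predictions to ground truth objects of the same label.

    Returns (false_positives, false_negatives).
    """
    unmatched_gt = list(ground_truth)
    false_positives = []
    # Process the most confident predictions first.
    for pred in sorted(predictions, key=lambda p: p["confidence"], reverse=True):
        candidates = [
            (iou(pred["bbox"], gt["bbox"]), i)
            for i, gt in enumerate(unmatched_gt)
            if gt["label"] == pred["label"]
        ]
        best_iou, best_index = max(candidates, default=(0.0, -1))
        if best_iou >= iou_threshold:
            unmatched_gt.pop(best_index)  # matched: a true positive
        else:
            false_positives.append(pred)
    # Ground truth objects that no prediction matched are false negatives.
    return false_positives, unmatched_gt


# Hypothetical sample: the rear wheel was never annotated.
ground_truth = [
    {"label": "car", "bbox": [0, 0, 100, 60]},
    {"label": "wheel", "bbox": [10, 40, 30, 60]},
    {"label": "person", "bbox": [200, 0, 240, 100]},
]
predictions = [
    {"label": "car", "bbox": [2, 1, 98, 62], "confidence": 0.90},
    {"label": "wheel", "bbox": [11, 41, 31, 61], "confidence": 0.85},
    {"label": "wheel", "bbox": [70, 40, 90, 60], "confidence": 0.80},
]

false_positives, false_negatives = find_failures(ground_truth, predictions)
print("false positives:", false_positives)
print("false negatives:", false_negatives)
```

Here the second wheel prediction is reported as a false positive even though it may be correct, mirroring the missing-annotation case described above. That is why these samples need to be inspected by eye rather than only counted.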
Develop solutions - Identifying failure cases is the first step toward finding ways to improve model performance. In most cases, the fix comes back to adding training data similar to the samples on which the model failed, but it can also include things like changing pre-processing or post-processing steps in the pipeline or fixing annotations. Whatever the solution is, you can only find it by understanding where the model fails.
Phase 4: Production
Finally! You have a model that performs well on your evaluation metrics and makes no major errors on various edge cases.
Now you need:
Monitor the model - Test your deployment to ensure that the model still performs as expected on test data, both on your evaluation metrics and on things like inference speed.
- Pachyderm, Algorithmia, DataRobot, Kubeflow, MLflow - deploy and monitor models and pipelines
- Amazon Web Services, Google AutoML, Microsoft Azure-cloud-based solutions for ML models
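A basic deployment check of this kind can be sketched as follows. The `predict` callable stands in for a request to a deployed model, and the thresholds, the sample data, and the function names are hypothetical; a real setup would run a check like this on a schedule and alert on failures.

```python
import statistics
import time


def check_deployment(predict, test_samples, expected_labels,
                     min_accuracy, max_latency_seconds):
    """Run held-out samples through the model and compare against thresholds."""
    latencies = []
    correct = 0
    for sample, expected in zip(test_samples, expected_labels):
        start = time.perf_counter()
        prediction = predict(sample)
        latencies.append(time.perf_counter() - start)
        correct += prediction == expected
    accuracy = correct / len(test_samples)
    median_latency = statistics.median(latencies)
    return {
        "accuracy": accuracy,
        "median_latency_seconds": median_latency,
        "accuracy_ok": accuracy >= min_accuracy,
        "latency_ok": median_latency <= max_latency_seconds,
    }


def dummy_predict(value):
    """Stand-in for a call to a deployed model (hypothetical)."""
    return "positive" if value >= 0 else "negative"


report = check_deployment(
    dummy_predict,
    test_samples=[-2, -1, 0, 3],
    expected_labels=["negative", "negative", "negative", "positive"],
    min_accuracy=0.7,
    max_latency_seconds=0.5,
)
print(report)
```

Using the median latency rather than the mean keeps a single slow request from dominating the check; whether that is the right choice depends on the latency guarantees your application needs.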
Evaluate new data - Using a model in production means you will often pass brand new data through it that it has never been tested on. It is important to evaluate the model on this data and dig into specific samples to see how it handles any new data it encounters.
Continue to understand the model - Some errors and biases in the model may be deeply ingrained and take a long time to uncover. You need to keep testing and probing the model for edge cases and trends that could cause problems if customers were to discover them first.
Expand functionality - Even if everything is working properly, the model may not deliver the increase in profit you were hoping for. From adding new classes and developing new data streams to making the model more efficient, there are countless ways to extend the capabilities of the current model and make it better. Whenever you want to improve the system, you need to restart the machine learning life cycle: update the data and the model, then evaluate everything to ensure that the new features work as expected.
The above is fairly general and unbiased, but I want to tell you a bit about the tool I have been working on.
There are many tools for every part of the machine learning life cycle. However, there is a distinct lack of tools that help with some of the key points I highlighted in this article. Operations such as visualizing complex data (such as images or videos) together with its labels, or writing queries to find the specific cases where the model performs poorly, are usually done with hand-written scripts.
I have been working at Voxel51 on FiftyOne, an open-source data visualization tool designed to help debug datasets and models and fill this gap. FiftyOne lets you visualize image and video datasets and model predictions in a GUI, either locally or remotely. It also provides powerful functionality for evaluating models and writing advanced queries over any aspect of the data set or model output.
FiftyOne also runs in notebooks, so you can use this Colab notebook to try it in your browser. Alternatively, you can easily install it using pip.
pip install fiftyone
Only a small percentage of the companies that try to incorporate machine learning (ML) into their business actually deploy a model into production. The life cycle of an ML model is not straightforward; it requires continuous iteration between data and annotation improvement, model and training pipeline construction, and sample-level evaluation. If you know what you are getting into, this cycle can eventually produce a production-ready model, but that model will need to be maintained and updated over time. Fortunately, countless tools have been developed to help with each step of this process.