Developing a machine learning (ML) application involves a lot of trial and error with different datasets, ML models, model parameters, and code to achieve the best results. Once you scale your ML initiatives across your organization, managing different versions of multiple models can get complicated quickly.
Given the high failure rate of AI projects, companies need a structured way to manage their models if their ML applications are to succeed. In this article, we’ll explore the model registry, a tool designed to manage models systematically.
What is a model registry?
A model registry is an archive, or central repository, that stores all models and their metadata. It provides a user interface to access and retrieve models.
Imagine you want to cite a sentence while writing an article but cannot remember the source. Unless you find the source material, you cannot cite it, and tracking it down can take a long time. Recording your sources in a note-taking application would have prevented that loss of time. A model registry serves the same function in the context of machine learning.
Through a model registry, it is possible to store and find:
- The software and the data that are used during the training process,
- The evaluation metrics,
- The output that is obtained as a result of the training process, i.e., the model artifacts,
- All versions of each model.
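The items listed above can be pictured as fields on a registry entry. The following is a minimal in-memory sketch, not how any particular product works: the class names `ModelVersion` and `ModelRegistry`, the field names, and the example values (commit hashes, `s3://` paths, metric values) are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ModelVersion:
    """One registered version of a model and the metadata stored with it."""
    name: str
    version: int
    code_ref: str              # reference to the training code (hypothetical format)
    data_ref: str              # reference to the training data (hypothetical format)
    metrics: Dict[str, float]  # evaluation metrics
    artifact_uri: str          # location of the trained model artifact


class ModelRegistry:
    """Toy registry that keeps every version of every model in memory."""

    def __init__(self) -> None:
        self._models: Dict[str, List[ModelVersion]] = {}

    def register(self, name: str, code_ref: str, data_ref: str,
                 metrics: Dict[str, float], artifact_uri: str) -> ModelVersion:
        """Store a new version; version numbers start at 1 and increase by 1."""
        versions = self._models.setdefault(name, [])
        entry = ModelVersion(name, len(versions) + 1, code_ref, data_ref,
                             dict(metrics), artifact_uri)
        versions.append(entry)
        return entry

    def get(self, name: str, version: int) -> ModelVersion:
        """Retrieve one specific version of a model."""
        return self._models[name][version - 1]

    def latest(self, name: str) -> ModelVersion:
        """Retrieve the most recently registered version of a model."""
        return self._models[name][-1]

    def versions(self, name: str) -> List[ModelVersion]:
        """Return all stored versions of a model, oldest first."""
        return list(self._models[name])


# Example usage with made-up values.
registry = ModelRegistry()
registry.register("churn", "abc123", "s3://bucket/train-v1.csv",
                  {"auc": 0.81}, "s3://bucket/churn/1/model.pkl")
registry.register("churn", "def456", "s3://bucket/train-v2.csv",
                  {"auc": 0.84}, "s3://bucket/churn/2/model.pkl")
```

A real registry would persist these entries in a database and an artifact store rather than a Python dictionary, but the shape of the stored record is similar.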
Why is it important?
A model registry makes collaboration easier. Because storage is centralized, the most up-to-date version of every model can be found in one place. Data scientists can therefore avoid working on overlapping problems or repeating the same mistakes. Knowing what others are doing both enables joint work and saves time. Moreover, since all versions of a model are stored, a data scientist can pick up a version that another team member abandoned as unsuited to their own purposes but that is useful for a different one.
Enables efficient model lifecycle management
A model registry makes the lifecycle of models transparent. In this way, each team member can keep track of a model’s progress. In addition to being stored, the different versions of a model can be found with ease. It also enables monitoring and cross-checking the performance of models.
Streamlines model deployment
A model registry facilitates model deployment, the step in which models are pushed into production. By tracking, monitoring, comparing, and searching all the models, data scientists can streamline the transition from development to production.
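As a rough illustration of how comparing versions can feed into deployment, the sketch below picks the version with the best score on a chosen metric and marks it as deployed. The entries, the metric name `auc`, its values, and the helper names `best_version` and `promote` are assumptions made for this example. Retiring the previously deployed version is also an assumed policy, not a rule that every registry follows.

```python
from typing import Dict, List, Optional

# Hypothetical registry contents: each entry is one version of one model.
versions: List[Dict] = [
    {"name": "churn", "version": 1, "metrics": {"auc": 0.81}, "stage": "deployed"},
    {"name": "churn", "version": 2, "metrics": {"auc": 0.84}, "stage": "trained"},
    {"name": "churn", "version": 3, "metrics": {"auc": 0.79}, "stage": "trained"},
]


def best_version(entries: List[Dict], name: str, metric: str) -> Optional[Dict]:
    """Return the entry of `name` with the highest value of `metric`, if any."""
    candidates = [e for e in entries
                  if e["name"] == name and metric in e["metrics"]]
    if not candidates:
        return None
    return max(candidates, key=lambda e: e["metrics"][metric])


def promote(entries: List[Dict], name: str, version: int) -> None:
    """Mark one version as deployed and retire the version it replaces."""
    target = next((e for e in entries
                   if e["name"] == name and e["version"] == version), None)
    if target is None:
        raise KeyError(f"{name} version {version} is not registered")
    for e in entries:
        if e["name"] == name and e["stage"] == "deployed":
            e["stage"] = "retired"
    target["stage"] = "deployed"


# Compare all versions of "churn" on AUC and promote the winner.
winner = best_version(versions, "churn", "auc")
if winner is not None:
    promote(versions, "churn", winner["version"])
```

In practice the promotion decision usually involves more than a single metric, such as review steps or tests on fresh data, so this should be read as the simplest possible version of the idea.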
How is it different from experiment tracking?
A model registry can be confused with experiment tracking. Although experiment tracking also allows tracking different versions of a model and storing training data, the two serve different purposes.
Experiment tracking tools are primarily used during the model development process to keep track of the different experiments run on a model. The experimentation phase of a model ends once it enters production. A model registry, on the other hand, stores all models, including those that are under development, trained, deployed, or retired. In this way, it provides a bigger picture of different models and their lifecycles than an experiment tracking tool does.
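The lifecycle stages mentioned above (under development, trained, deployed, retired) can be sketched as a small state machine. The stage names come from the text, but the transition rules in `ALLOWED` and the class name `TrackedModel` are assumptions for illustration only. An actual registry may allow different moves, for example redeploying a retired model.

```python
from enum import Enum
from typing import Dict, List, Set


class Stage(Enum):
    DEVELOPMENT = "under development"
    TRAINED = "trained"
    DEPLOYED = "deployed"
    RETIRED = "retired"


# Assumed transition rules; these are not taken from any specific product.
ALLOWED: Dict[Stage, Set[Stage]] = {
    Stage.DEVELOPMENT: {Stage.TRAINED, Stage.RETIRED},
    Stage.TRAINED: {Stage.DEPLOYED, Stage.RETIRED},
    Stage.DEPLOYED: {Stage.RETIRED},
    Stage.RETIRED: set(),
}


class TrackedModel:
    """A model whose current stage and stage history are recorded."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.stage = Stage.DEVELOPMENT
        self.history: List[Stage] = [Stage.DEVELOPMENT]

    def move_to(self, new_stage: Stage) -> None:
        """Change stage, rejecting moves the assumed rules do not allow."""
        if new_stage not in ALLOWED[self.stage]:
            raise ValueError(
                f"cannot move {self.name} from "
                f"'{self.stage.value}' to '{new_stage.value}'"
            )
        self.stage = new_stage
        self.history.append(new_stage)


# Walk one model through its whole lifecycle.
model = TrackedModel("churn")
model.move_to(Stage.TRAINED)
model.move_to(Stage.DEPLOYED)
model.move_to(Stage.RETIRED)
```

Keeping the history alongside the current stage is what lets a registry show the full lifecycle of a model rather than only its experiments.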
How to get started with a model registry?
A model registry is typically part of an MLOps or data science platform. Some MLOps platforms that provide a model registry feature are:
- Amazon SageMaker
Cem has been the principal analyst at AIMultiple since 2017. AIMultiple informs hundreds of thousands of businesses (as per similarWeb) including 60% of Fortune 500 every month.