AIMultiple Research

Experiment Tracking: What it is, Best Practices & Tools in 2024

Updated on Dec 22
3 min read
Written by
Cem Dilmegani

Cem is the principal analyst at AIMultiple since 2017. AIMultiple informs hundreds of thousands of businesses (as per Similarweb) including 60% of Fortune 500 every month.

Cem's work focuses on how enterprises can leverage new technologies in AI, automation, cybersecurity (including network security and application security), and data collection, including web data collection and process intelligence.


Machine learning is changing every industry and business function, with hundreds of applications and use cases. However, AI and ML projects have a high rate of failure. Inspired by DevOps practices for software development, MLOps brings efficiency to machine learning model development and reduces the chance of failure for ML projects.

Developing ML models involves lots of trial and error, or experiments. Tracking these experiments in a structured way is an integral part of every successful ML project. In this article, we’ll explore one of the components of MLOps practices: experiment tracking.

What is experiment tracking?

Experiment tracking is the practice of keeping track of important information (metadata) about the different experiments run while developing a machine learning model. This metadata can include:

  • Different ML models
  • Model hyperparameters such as the size of a neural network
  • Different versions of training data
  • Code used in model development

This is a non-exhaustive list and the important metadata about experiments depends on the project.
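As an illustration, the metadata listed above can be captured in one structured record per experiment. This is a minimal sketch in plain Python; the field names are our own and not taken from any specific tool:

```python
from dataclasses import dataclass, field, asdict

@dataclass
class ExperimentRecord:
    """Minimal metadata for one ML experiment run (illustrative field names)."""
    model_name: str        # which ML model was tried
    hyperparameters: dict  # e.g. network size, learning rate
    data_version: str      # which version of the training data was used
    code_version: str      # e.g. a git commit hash of the experiment code
    metrics: dict = field(default_factory=dict)  # filled in after training

run = ExperimentRecord(
    model_name="random_forest",
    hyperparameters={"n_estimators": 100, "max_depth": 8},
    data_version="v2",
    code_version="abc1234",
)
run.metrics["accuracy"] = 0.91
print(asdict(run)["metrics"])  # {'accuracy': 0.91}
```

Storing every run in this shape is what makes later comparison and reproduction possible, whatever tool ultimately holds the records.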

Why is experiment tracking important?

Tracking ML model experiments in a structured manner enables data scientists to identify the factors that affect model performance, compare the results, and select the optimal version.

A typical process of developing an ML model involves collecting and preparing training data, selecting a model, and training the model with prepared data. A small change in the training data, model hyperparameters, model type, or code that is written to run the experiment can drastically change model performance. 

Data scientists usually run different versions of a model by changing its components. Hence, arriving at the best-performing model according to one or more evaluation metrics is an iterative process. Without tracking the experiments conducted during model development, it is not possible to compare or reproduce the results of different iterations.

How can experiment tracking be implemented?

Manually recording all the information about different experiments in spreadsheets is an option, especially if you don’t run many experiments. However, machine learning projects typically involve numerous variables to track, and these variables have complex relationships with each other. Therefore, manual tracking can be time-consuming and hard to scale.
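For very small projects, the spreadsheet approach can be as simple as appending one row per run. A hedged sketch using only the Python standard library (the column names are our own):

```python
import csv
import io

# Illustrative spreadsheet-style log: one row per experiment run.
rows = [
    {"run_id": 1, "model": "logistic_regression", "learning_rate": 0.1,  "accuracy": 0.86},
    {"run_id": 2, "model": "logistic_regression", "learning_rate": 0.01, "accuracy": 0.89},
    {"run_id": 3, "model": "random_forest",       "learning_rate": "",   "accuracy": 0.91},
]

# Write the log as CSV (in-memory here; a real log would go to a file).
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["run_id", "model", "learning_rate", "accuracy"])
writer.writeheader()
writer.writerows(rows)
csv_text = buf.getvalue()
print(csv_text.splitlines()[0])  # run_id,model,learning_rate,accuracy
```

The scaling problem shows up quickly: every new variable (a new hyperparameter, a new dataset version) means a new column, and relationships between columns are invisible in a flat sheet.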

Fortunately, there are tools that are designed to track machine learning experiments. These tools:

  • Provide a hub to store different ML projects and their experiments
  • Can be integrated with various model training frameworks
  • Automatically register all the information you want about experiments
  • Have a user-friendly UI to search and compare experiments
  • Leverage visualizations to represent experiments, helping users interpret results quickly and communicate them to audiences without a technical background
  • Let you track the hardware consumption of different experiments
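To illustrate the search-and-compare capability these tools provide, here is a pure-Python sketch that selects the best run by a chosen metric. The run structure below is our own assumption, not a specific tool's format or API:

```python
# Illustrative experiment log: one dict per run (structure is assumed for this sketch).
runs = [
    {"run_id": "a1", "model": "model_X", "params": {"lr": 0.1},  "metrics": {"accuracy": 0.88}},
    {"run_id": "b2", "model": "model_Y", "params": {"lr": 0.1},  "metrics": {"accuracy": 0.84}},
    {"run_id": "c3", "model": "model_X", "params": {"lr": 0.01}, "metrics": {"accuracy": 0.91}},
]

def best_run(runs, metric):
    """Return the run with the highest value for the given metric."""
    return max(runs, key=lambda r: r["metrics"][metric])

winner = best_run(runs, "accuracy")
print(winner["run_id"], winner["metrics"]["accuracy"])  # c3 0.91
```

Dedicated tracking tools do essentially this over thousands of automatically registered runs, with filtering and visualization on top.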

What are the best practices for tracking ML experiments?

To get the most out of your ML experiment tracking, you need to define:

  • the objective of the experiment
  • evaluation metrics (accuracy, explainability, etc.)
  • experiment variables (different models, hyperparameters, datasets, etc.)

For example, if you are trying to increase model accuracy, determine the accuracy metrics and form a hypothesis such as “if we use model X, it will deliver higher accuracy than model Y.” Trying several things without a framework and picking the best one is counterproductive if you haven’t decided what makes an experiment successful.
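The hypothesis-driven approach above can be sketched as a small comparison under a pre-defined success metric. The "models" and evaluation data below are placeholders, not real trained models:

```python
# Placeholder models: each maps an input to a predicted label.
# In practice these would be two trained ML model versions under comparison.
def model_X(x):
    return 1 if x >= 0.5 else 0

def model_Y(x):
    return 1 if x >= 0.7 else 0

def accuracy(model, data):
    """Fraction of examples labeled correctly: the pre-defined evaluation metric."""
    return sum(model(x) == y for x, y in data) / len(data)

# Toy evaluation set of (input, true_label) pairs.
data = [(0.2, 0), (0.4, 0), (0.6, 1), (0.8, 1), (0.55, 1)]

acc_x = accuracy(model_X, data)
acc_y = accuracy(model_Y, data)
# Hypothesis: model X delivers higher accuracy than model Y.
print(acc_x > acc_y)  # True: model X also classifies 0.55 and 0.6 correctly
```

Because the metric and the hypothesis were fixed before running the comparison, the result is an answer to a question rather than an after-the-fact rationalization.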

What are the tools for experiment tracking?

There are both open source and commercial tools for tracking experiments. Some popular tools include:

Name              Status       Launched In
Comet             Private      2017
Guild AI          Open Source  2019
ModelDB           Open Source  2020
Neptune.ai        Private      2017
TensorBoard       Open Source  2017
Weights & Biases  Private      2018
MLFlow Tracking   Open Source  2018

Experiment tracking is one of the MLOps practices that streamline machine learning model development. MLOps platforms that provide end-to-end machine learning lifecycle management also include tools for experiment tracking. Some MLOps platforms are:

  • Amazon SageMaker
  • H2O.ai
  • Iguazio
  • MLFlow

You can also check our MLOps tools article for a comprehensive account.

