
Transfer Learning in 2024: What It Is & How It Works


Training machine learning models can be a challenging data science task. Training algorithms might not work as intended, training can take too long, or the training data can be problematic. Transfer learning is one technique that makes training easier. Just as humans can transfer their knowledge of one topic to a similar one, transfer learning enables data scientists to transfer insights gained from one machine learning task to a similar one. In this way, they can shorten model training time and rely on fewer data points.

What is transfer learning?

Transfer learning is a machine learning technique that enables data scientists to benefit from the knowledge gained from a previously trained model when working on a similar task. The technique mirrors humans' ability to transfer knowledge: if you learn how to ride a bicycle, you can learn to ride other two-wheeled vehicles more easily. Similarly, a model trained for autonomous driving of cars can be used for autonomous driving of trucks. Wikipedia defines transfer learning as follows:

Transfer learning is a research problem in machine learning that focuses on storing knowledge gained while solving one problem and applying it to a different but related problem.

This technique applies to many machine learning models, including deep learning models like artificial neural networks and reinforcement learning models. The goal is to leverage a pre-trained model's knowledge while performing a different task. For a better understanding, the figure below shows how transfer learning differs from traditional machine learning methods:

Figure: Difference between traditional machine learning and transfer learning (Source: Educative).

How does it work?

There are two related approaches:

Using a pre-trained model

Here is how transfer learning works:

  1. Select a source model: A pre-trained model is chosen as the source of the knowledge that will be transferred to the target model. 
  2. Adapt the source model: Some parts of the source model (for example, its output layer) may not match the target task's data, so they are adjusted before the knowledge is transferred.
  3. Train toward the target model: After the source model is adapted, it is used as the starting point and trained further on the target task's data to obtain the target model (see the sketch below).
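
Below is a minimal sketch of these three steps, assuming PyTorch and torchvision; the ResNet-18 source model and the two-class truck/bus target task are illustrative assumptions, not a prescribed recipe.

```python
# A minimal sketch of the three steps above (assumes PyTorch + torchvision).
# The 2-class truck/bus target task is a hypothetical example.
import torch
import torch.nn as nn
from torchvision import models

# 1. Select source model: a ResNet-18 pre-trained on ImageNet.
source_model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# 2. Adapt the source model: replace the ImageNet classification head
#    with one sized for the target task's classes.
num_target_classes = 2  # hypothetical: trucks vs. buses
source_model.fc = nn.Linear(source_model.fc.in_features, num_target_classes)

# 3. Train toward the target model: continue training from the
#    pre-trained weights on the target task's data.
optimizer = torch.optim.Adam(source_model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()
# Inside a loop over a target-task DataLoader (not shown), each step would do:
#   optimizer.zero_grad()
#   loss = loss_fn(source_model(images), labels)
#   loss.backward()
#   optimizer.step()
```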

Developing a new model

In some cases, data scientists may decide to create a new model from scratch and then transfer its knowledge to the main task. For example, suppose you need models to detect trucks and buses in images but don't have enough data for either. You can first build a model that identifies cars, for which data is plentiful. That model can then serve as the starting point for the truck and bus models, which are fine-tuned with the limited data available. A minimal sketch of this path follows.
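
Here is a hedged sketch of the "develop a new model" path, again assuming PyTorch; the small convolutional backbone, the car classifier, and the file name are hypothetical stand-ins for whatever source model and data you actually have.

```python
# A hedged sketch of the "develop a new model" path (assumes PyTorch).
# The backbone, the car classifier, and the file name are hypothetical.
import torch
import torch.nn as nn

def make_backbone() -> nn.Sequential:
    # Small convolutional feature extractor shared by both tasks.
    return nn.Sequential(
        nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    )

# Source task: first train a car classifier on the plentiful car data.
car_model = nn.Sequential(make_backbone(), nn.Linear(16, 2))  # car / not car
# ... training loop on the car dataset omitted ...
torch.save(car_model[0].state_dict(), "car_backbone.pt")

# Target task: reuse the learned backbone as the starting point
# for the truck/bus classifier, then fine-tune it on the limited data.
target_model = nn.Sequential(make_backbone(), nn.Linear(16, 2))  # truck / bus
target_model[0].load_state_dict(torch.load("car_backbone.pt"))
# ... fine-tuning loop on the small truck/bus dataset omitted ...
```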

To find the right AI data partner for your AI models, you can check our data-driven list of data collection/harvesting services and choose the option that best suits your project needs.

When to use transfer learning?

For transfer learning, data scientists need a machine learning model that has already been trained on a similar task, so they cannot use transfer learning in every situation. When it is applicable, however, transfer learning delivers better results in a shorter amount of time. Data scientists can adopt transfer learning under the following conditions:

There is not enough data

In some cases, data scientists might not have enough data to train their machine learning models. Working with an insufficient amount of data results in lower performance, so starting with a pre-trained model helps data scientists build better models.

There is not enough time to train

Some machine learning models are difficult to train and can take too long to reach acceptable performance. When there is not enough time to build a new model, or there are too many machine learning tasks to handle, data scientists can adopt a similar pre-trained model instead of creating one from scratch, saving the time it would take to build a model.

What are the main benefits?

According to this book about machine learning, transfer learning provides the following benefits:

Figure: A line graph showing that transfer learning yields better results than traditional training (Source: Machine Learning Mastery).
  • Better initial model: With other learning methods, you have to build a model without any prior knowledge. Transfer learning offers a better starting point and can perform the task at some level even before any training.
  • Higher rate of improvement: The model's skill improves more quickly during training because it has already been trained on a similar task.
  • Higher accuracy after training: With a better starting point and a faster rate of improvement, transfer learning enables the model to converge at a higher performance level, producing more accurate output.
  • Faster training: The model can reach the desired performance faster than with traditional learning methods since it leverages a pre-trained model.

However, the performance gain from transfer learning is not always much higher than that of traditional learning, and its impact cannot be determined until the target model has been developed.

What are some example applications of transfer learning?

Technologies

  • Image Recognition: Transfer learning can be used between different image recognition tasks. For example, a model used for identifying dogs can be used for identifying cats.
  • Natural Language Processing (NLP): NLP is one of the most popular applications of transfer learning. For example, the knowledge of pre-trained AI models that understand linguistic structures can be transferred to models that aim to predict the next word in a sequence based on the previous sentences (a minimal sketch follows this list).
  • Speech Recognition: An AI model developed for English speech recognition can be used as the basis for a German speech recognition model.
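
As an illustration of the NLP case above, the following sketch loads a pre-trained language model and attaches a new classification head using the Hugging Face transformers library; the model name and the two-label task are assumptions made for the example, not part of the original article.

```python
# An illustrative NLP transfer learning sketch (assumes the Hugging Face
# transformers library); the model name and label count are assumptions.
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased",  # pre-trained encoder: the transferred knowledge
    num_labels=2,         # new, randomly initialized task head
)

# The pre-trained encoder already captures linguistic structure;
# only the small task head needs to be learned during fine-tuning.
inputs = tokenizer("Transfer learning reuses linguistic knowledge.",
                   return_tensors="pt")
outputs = model(**inputs)
print(outputs.logits.shape)  # torch.Size([1, 2]) before any fine-tuning
```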

Industries

  • Autonomous Driving
    • A model trained for autonomous car driving can be used for autonomous truck driving. 
    • Transfer learning can also be used for detecting different kinds of objects. For example, a model for detecting other cars on the road can be used for detecting motorcycles or buses during autonomous driving.
  • Gaming
    • A model that developed strategies while playing Go can be applied to chess. For example, the knowledge of AlphaGo can be transferred to other games instead of spending time creating new models from scratch.
  • Healthcare:
    • Because Electromyographic (EMG) signals from muscles and Electroencephalographic (EEG) brainwaves have similar physical natures, transfer learning between them is possible for gesture recognition. 
    • Medical imaging is another application of transfer learning. For example, a model trained on images from MRI scans can be used as the initial model for analyzing CT scans. However, Google has reported that transfer learning does not yet significantly improve machine learning performance in medical imaging tasks.
  • Spam filtering: An AI model trained to categorize emails can be used to filter spam.

There are also open-source pre-trained models, such as AlexNet and ResNet, available to data scientists. These models can serve as starting points for handling new machine learning tasks.

What are the best practices?

  • Take advantage of pre-trained models: There are many open-source pre-trained models you can leverage in different fields. Instead of creating new source models, relying on these models can be more reliable (in terms of model architecture), save you time in building your target model, and spare you from facing new problems.
  • Be aware of what you use as the source model: You should know your source model and whether it is compatible with your target. If your source model is problematic or unrelated to your target model, reaching the target model can take longer.
  • Larger datasets don't require transfer learning: Although transfer learning improves the performance of machine learning models, it might not give the desired impact, especially for tasks with large datasets. Traditional learning starts with randomized weights and tunes them until they converge; transfer learning starts with a pre-trained model, but a large dataset also means many training iterations, which makes the initial weights matter less.
  • Be careful with overfitting: In tasks with a small amount of data, if the source model is too similar to the target model (like identifying cats and dogs), you might end up with overfitting. Tuning the learning rate, freezing some layers from the source model, or adding linear classifiers while training the target model can help you avoid this issue (see the sketch below).
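
As a sketch of the layer-freezing practice mentioned above, the snippet below (assuming PyTorch and torchvision) freezes the pre-trained layers of a ResNet-18 and leaves only a new linear head trainable; the model choice and class count are illustrative assumptions.

```python
# A sketch of freezing source-model layers to limit overfitting
# (assumes PyTorch + torchvision; class count is illustrative).
import torch
import torch.nn as nn
from torchvision import models

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Freeze every pre-trained parameter so only new layers are updated.
for param in model.parameters():
    param.requires_grad = False

# Replace the head with a new linear classifier, which stays trainable.
model.fc = nn.Linear(model.fc.in_features, 2)

# Optimize only the trainable parameters, typically with a small learning rate.
trainable_params = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.Adam(trainable_params, lr=1e-4)
```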


