Training machine learning models can be a challenging data science task. The training algorithms might not work as intended, training can take too long, or the training data can be problematic. Transfer learning is one technique that makes training easier. Just as humans can transfer their knowledge of one topic to a similar one, transfer learning enables data scientists to transfer insights gained from one machine learning task to a similar one. As a result, they can shorten model training time and rely on fewer data points.
What is transfer learning?
Transfer learning is a machine learning technique that enables data scientists to benefit from the knowledge gained from a previously trained model when working on a similar task. This approach mimics humans' ability to transfer their knowledge: if you learn how to ride a bicycle, you can learn how to ride other two-wheeled vehicles more easily. Similarly, a model trained for autonomous driving of cars can be used for autonomous driving of trucks. Wikipedia defines transfer learning as follows:
Transfer learning is a research problem in machine learning that focuses on storing knowledge gained while solving one problem and applying it to a different but related problem.
This technique is applicable to many machine learning models, including deep learning models such as artificial neural networks and reinforcement learning models. The goal is to leverage a pre-trained model's knowledge while performing a different task. For a better understanding, here is a figure showing how transfer learning differs from traditional machine learning methods:
How does it work?
There are two related approaches:
Using a pre-trained model
Here is how transfer learning works:
- Select a source model: A pre-trained source model is chosen for transferring its knowledge to the target model.
- Adapt the source model to create the target model: Some features of the source model can differ from the training data of the target task, so mismatched features should be reconciled before knowledge is transferred.
- Train to achieve the target model: After the source model is tuned, the target model is obtained by using the source model as the starting point for further training.
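The steps above can be sketched with a toy example. The snippet below is a minimal, hypothetical illustration in pure NumPy: a logistic-regression "source model" is trained on a large, related dataset, and its weights are then used as the starting point for a target model that has very little data. Real projects would typically reuse a deep pre-trained network rather than a linear model.

```python
# Minimal transfer-learning sketch in pure NumPy (all data is synthetic).
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(X, y, w_init, lr=0.1, epochs=200):
    """Plain gradient descent on logistic loss, starting from w_init."""
    w = w_init.copy()
    for _ in range(epochs):
        grad = X.T @ (sigmoid(X @ w) - y) / len(y)
        w -= lr * grad
    return w

def accuracy(X, y, w):
    return np.mean((sigmoid(X @ w) > 0.5) == y)

# Source task: plenty of data, labels depend on a hidden direction w_true.
w_true = np.array([2.0, -1.0, 0.5])
X_src = rng.normal(size=(1000, 3))
y_src = (X_src @ w_true > 0).astype(float)

# Step 1: select (here, train) the source model.
w_source = train(X_src, y_src, w_init=np.zeros(3))

# Target task: only 20 examples, but a *related* labeling rule.
X_tgt = rng.normal(size=(20, 3))
y_tgt = (X_tgt @ (w_true + 0.3) > 0).astype(float)

# Step 3: use the source weights as the starting point for the target model,
# and compare against training from scratch with the same budget.
w_transfer = train(X_tgt, y_tgt, w_init=w_source, epochs=20)
w_scratch = train(X_tgt, y_tgt, w_init=np.zeros(3), epochs=20)

X_test = rng.normal(size=(500, 3))
y_test = (X_test @ (w_true + 0.3) > 0).astype(float)
print("transfer:", accuracy(X_test, y_test, w_transfer))
print("scratch: ", accuracy(X_test, y_test, w_scratch))
```

Because the transferred model starts close to a good solution, it typically reaches higher test accuracy than the from-scratch model under the same small training budget.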
Developing a new model
In some cases, data scientists may decide to create a new model and transfer its knowledge to the main task. For example, suppose you need models to detect trucks and buses in images, but you don't have enough data. You can first build a new model that identifies cars. That model can then serve as the starting point for identifying trucks and buses and be further trained with the available data.
When to use transfer learning?
For transfer learning, data scientists need a machine learning model that was previously trained on a similar task, so they cannot use transfer learning in every situation. However, when it is applicable, transfer learning delivers better results in a shorter amount of time. Data scientists can apply transfer learning in the following conditions:
There is not enough data
In some cases, data scientists might not have enough data to train their machine learning models. Working with an insufficient amount of data would result in lower performance; starting with a pre-trained model helps data scientists build better models.
There is not enough time to train
Some machine learning models cannot be trained easily and can take too long to work properly. When there is not enough time to build a new model, or there are too many machine learning tasks to handle, data scientists can adopt a similar pre-trained model instead. This saves the time that would otherwise be spent building a model from scratch.
Have we reached the peak?
Looking at the last five years, we can observe an increasing interest in transfer learning since 2017. As the technology has become popular recently, we expect this trend to continue for a few years. The growing interest in transfer learning can be attributed to advances in machine learning and the increasing number of real-life applications of transfer learning.
What are the main benefits?
According to this book about machine learning, transfer learning provides the following benefits:
- Better initial model: In other types of learning, you need to build a model without any prior knowledge. Transfer learning offers a better starting point and can perform tasks at some level even before training.
- Higher learning rate: Transfer learning offers a higher learning rate during training since the model has already been trained on a similar task.
- Higher accuracy after training: With a better starting point and a higher learning rate, transfer learning enables a machine learning model to converge at a higher performance level, producing more accurate output.
- Faster training: Training can reach the desired performance faster than traditional learning methods since it leverages a pre-trained model.
However, the performance of transfer learning might not be much higher than that of traditionally trained models, and its impact cannot be determined until the target model has been developed and evaluated.
What are some example applications of transfer learning?
- Image Recognition: Transfer learning can be used between different image recognition tasks. For example, a model used for identifying dogs can be used for identifying cats.
- Natural Language Processing (NLP): NLP is one of the most popular applications of transfer learning. For example, the knowledge of pre-trained AI models that understand linguistic structures can be transferred to other models that aim to predict the next word in a sequence based on previous sentences.
- Speech Recognition: An AI model developed for English speech recognition can be used as the basis for a German speech recognition model.
- Autonomous Driving:
  - A model trained for autonomous car driving can be used for autonomous truck driving.
  - Transfer learning can also be used for detecting different kinds of objects. For example, a model for detecting other cars on the road can be adapted to detect motorcycles or buses during autonomous driving.
- Gaming: A model that developed strategies while playing Go can be applied to chess. For example, the knowledge of AlphaGo can be transferred to other games instead of spending time creating new models from scratch.
- Gesture recognition: Due to their similar physical natures, transfer learning is possible between electromyographic (EMG) signals from muscles and electroencephalographic (EEG) brainwaves.
- Medical imaging: A model trained on images from MRI scans can be used as the initial model for analyzing CT scans. However, Google reports that transfer learning does not yet significantly improve machine learning performance in medical imaging tasks.
- Spam filtering: An AI model trained to categorize emails can be used to filter spam.
There are also open-source pre-trained models, such as AlexNet and ResNet, that data scientists can use as starting points for their machine learning tasks.
What are the best practices?
- Take advantage of pre-trained models: There are many open-source pre-trained models you can leverage in different fields. Instead of creating new source models, using these models can be more reliable (in terms of model architecture), help you save time building your target model, and prevent you from facing new problems.
- Be aware of what you use as the source model: You should know your source model and whether it is compatible with your target. If your source model is problematic or unrelated to your target model, achieving the target model can take longer.
- Larger datasets don’t require transfer learning: Although transfer learning improves the performance of machine learning models, it might not have the desired impact, especially on tasks with large datasets. Traditional learning starts with randomized weights and tunes them until they converge. Transfer learning begins with a pre-trained model, but larger datasets also lead to more iterations, making the initial weights less important.
- Be careful with overfitting: In tasks with a small amount of data, if the source model is too similar to the target model (like identifying cats and dogs), you might end up with overfitting. Tuning the learning rate, freezing some layers from the source model, or adding linear classifiers while training the target model can help you avoid this issue.
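The layer-freezing idea mentioned above can be sketched with a toy example. The snippet below, in pure NumPy, keeps a feature layer fixed and trains only a final linear classifier on top of it; the frozen weights here are random stand-ins for a source model's learned features, so the whole setup is hypothetical. In real frameworks you would freeze layers with, for example, `requires_grad=False` in PyTorch or `layer.trainable = False` in Keras.

```python
# Sketch of freezing a source-model layer and training only the head.
import numpy as np

rng = np.random.default_rng(1)

# Pretend these weights come from a source model's hidden layer.
W_frozen = rng.normal(size=(5, 8))          # frozen feature extractor

def features(X):
    return np.maximum(X @ W_frozen, 0.0)    # ReLU features, never updated

# Tiny target dataset (few labeled examples, as in the overfitting case).
X = rng.normal(size=(30, 5))
y = (X[:, 0] - X[:, 1] > 0).astype(float)

# Train ONLY the final linear classifier on top of the frozen features.
w_head = np.zeros(8)
for _ in range(300):
    H = features(X)
    p = 1.0 / (1.0 + np.exp(-(H @ w_head)))
    w_head -= 0.1 * H.T @ (p - y) / len(y)  # gradient step on the head only

acc = np.mean(((features(X) @ w_head) > 0) == y)
print("training accuracy with frozen features:", acc)
```

Because only the 8 head weights are updated, the model has far fewer free parameters to overfit the 30 examples with, which is the main point of freezing layers on small target datasets.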
Here is a list of more AI-related articles you might be interested in:
- Artificial Intelligence (AI): In-depth Guide
- State of AI technology
- Future of AI according to top AI experts
- Advantages of AI according to top practitioners
- AI in Business: Guide to Transforming Your Company
- Self-Supervised Learning: In-depth Guide
If you have questions on transfer learning, feel free to contact us: