
What is Transfer Learning in Machine Learning? Six Steps of Transfer Learning

  • Yashoda Gandhi
  • Jul 16, 2022

Transfer learning is the application of a previously trained model to a new problem. It is currently very popular in deep learning because it can train deep neural networks with a small amount of data. This is extremely useful in the field of data science because most real-world problems do not have millions of labeled data points to train such complex models.


 

What is Transfer Learning?

 

Transfer Learning is a machine learning method in which knowledge obtained from a model used in one task is reused as the foundation for another task.

 

Machine learning algorithms use historical data as input to make predictions and generate new output values, and they are typically designed to perform isolated tasks. In transfer learning, the source task is the task from which knowledge is transferred, and the target task is the task whose learning improves as a result of that transfer.

 

During transfer learning, the knowledge and rapid progress gained on a source task are used to improve learning on a new target task. The knowledge is applied by mapping the attributes and characteristics of the source task onto the target task.

 

A negative transfer, on the other hand, occurs when the transfer method results in a decrease in the performance of the new target task. One of the most difficult challenges when working with transfer learning methods is providing and ensuring positive transfer between related tasks while avoiding negative transfer between less related tasks.

 

How Transfer Learning Works

 

Neural networks in computer vision typically detect edges in the first layers, shapes in the middle layers, and task-specific features in the later layers. In transfer learning, the early and middle layers are reused, while only the later layers are retrained, making use of the labeled data from the task the model was originally trained on.

 

Consider a model trained to detect a backpack in an image that is now being used to detect sunglasses. Because the model has already learned to recognize objects, we only need to retrain the later layers so it learns what distinguishes sunglasses from other objects.
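As a rough illustration, here is a minimal Keras sketch of that idea, assuming an ImageNet-pretrained VGG16 stands in for the hypothetical backpack detector and the new task is a binary sunglasses classifier:

```python
import tensorflow as tf

# Minimal sketch: freeze the pre-trained early/middle layers and
# retrain only a new head for the sunglasses task.
base = tf.keras.applications.VGG16(weights="imagenet", include_top=False,
                                   input_shape=(224, 224, 3))
base.trainable = False  # keep the learned edge/shape detectors intact

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # sunglasses vs. not
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
```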


 

When Should You Use Transfer Learning?

 

Transfer learning is used when there is insufficient annotated data to train our model and a pre-trained model trained on similar data and tasks is available. If you trained the original model with TensorFlow, for example, you can simply restore it and retrain some of its layers for your job.

 

Transfer learning, however, works only if the features learned in the first task are general, in the sense that they can be applied to another task. In addition, the model's input must be the same size as when it was first trained. Below we've noted some of the instances where transfer learning is used.

 

  1. Training a Model to Reuse It

 

Consider the following scenario: you want to tackle task A but don't have enough data to train a deep neural network. One way around this is to find a related task B with a large amount of data.

 

Train the deep neural network on task B before applying the model to task A. The problem at hand will determine whether you need to use the entire model or just a few layers.

 

If the input is the same in both tasks, you can reapply the model and make predictions for your new input. Otherwise, modifying and retraining the task-specific layers and the output layer is an approach worth investigating.
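A hedged sketch of this workflow in TensorFlow/Keras, assuming a hypothetical model saved earlier as task_b_model.h5 whose penultimate layer outputs a feature vector, and a hypothetical five-class target task:

```python
import tensorflow as tf

# Restore the model trained on task B (hypothetical file name).
task_b = tf.keras.models.load_model("task_b_model.h5")

# If the inputs match, the model can be reused directly:
# predictions = task_b.predict(new_inputs)

# Otherwise, keep everything except the task-specific output layer
# and attach a new head for task A.
backbone = tf.keras.Model(task_b.input, task_b.layers[-2].output)
backbone.trainable = False

num_task_a_classes = 5  # hypothetical
task_a = tf.keras.Sequential([
    backbone,
    tf.keras.layers.Dense(num_task_a_classes, activation="softmax"),
])
```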

 

  2. Using a Pre-Trained Model

 

The second possibility is to use a previously trained model. There are several of these models available, so do some preliminary research. The task determines the number of layers to reuse and retrain.

 

Keras provides a number of pre-trained models that are used for transfer learning, prediction, and fine-tuning; these models, along with quick tutorials on how to use them, are available in the Keras documentation. Many research institutions also make trained models available to the public. Deep learning is the most common application of this type of transfer learning.
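For instance, several of the Keras pre-trained models can each be loaded in a single line (a sketch; each call downloads ImageNet weights on first use):

```python
from tensorflow.keras import applications

# Each call builds the architecture and loads ImageNet weights.
vgg16 = applications.VGG16(weights="imagenet")
resnet50 = applications.ResNet50(weights="imagenet")
xception = applications.Xception(weights="imagenet")
```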

 

  3. Feature Extraction

 

Another option is to use deep learning to find the best representation of your problem by identifying the key features. This method is known as representation learning, and it can frequently outperform hand-designed representations.

 

Machine learning feature creation is mostly done by hand by researchers and domain experts. Deep learning, fortunately, can automatically extract features. Of course, this does not negate the significance of feature engineering and domain knowledge; you must still decide which features to include in your network.

 

Neural networks, on the other hand, can learn which features are important and which aren't. A representation learning algorithm can find a good combination of characteristics in a short amount of time, even for complicated tasks that would otherwise require a lot of human effort.

 

The learned representation can then be used to solve a variety of other problems. Simply use the first layers to find the best feature representation, but avoid the network's final output because it is too task-specific; instead, feed data into the network and take the output of one of the intermediate layers.

 

This layer can then be understood as a representation of the raw data. This method is popular in computer vision because it can shrink your dataset, reducing computation time and making it more suitable for traditional algorithms.
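A minimal sketch of feature extraction in Keras; the random batch below is a placeholder standing in for real, preprocessed images:

```python
import numpy as np
import tensorflow as tf

# Use only the convolutional layers (include_top=False) and average-pool
# the final feature maps into one vector per image.
extractor = tf.keras.applications.VGG16(weights="imagenet",
                                        include_top=False, pooling="avg")

images = np.random.rand(8, 224, 224, 3).astype("float32")  # placeholder batch
features = extractor.predict(images)  # shape (8, 512)

# `features` can now feed a traditional algorithm such as an SVM.
```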

 

 

Types of Transfer Learning

 

Below are the types of transfer learning:

 

  1. Inductive Transfer Learning: In this type of transfer learning, the source and target domains are the same, but the tasks the model works on are different.

 

The model will use inductive biases from the source task to help improve performance on the target task. Depending on whether the source domain contains labeled data, this leads to settings based on multi-task learning or self-taught learning.

 

  2. Unsupervised Transfer Learning: If you are not familiar with unsupervised learning, it is when an algorithm is trained to identify patterns in datasets that have not been labeled or classified.

 

Here, as in inductive transfer learning, the source and target domains are similar but the tasks are different; the distinguishing point is that the data in both the source and the target are unlabeled. Dimensionality reduction and clustering are well-known unsupervised learning techniques.

 

  3. Transductive Transfer Learning: In this type of transfer learning, the source and target tasks are similar, but the domains are different. The source domain has a lot of labeled data, whereas the target domain has none, which leads the model to use domain adaptation.

 

Also Read | Transfer Learning on Cifar-10


 

Applications for Transfer Learning

 

Transfer learning allows data scientists to learn from the knowledge gained from a previously used machine learning model for a similar task. This is why this technique is now being used in the fields listed below.

 

  1. NLP

 

NLP is one of the most appealing transfer learning applications. Transfer learning solves cross-domain tasks by leveraging the knowledge of pre-trained AI models that understand linguistic structures. 

 

Deep learning models such as BERT, XLNet, Albert, TF Universal Model, and others are used in everyday NLP tasks such as next word prediction, question answering, and machine translation.
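As a sketch of how such a pre-trained model can be reused off the shelf (this assumes the Hugging Face `transformers` library, which the article does not name but which distributes BERT and similar models):

```python
from transformers import pipeline

# Masked-word prediction with a pre-trained BERT model.
fill_mask = pipeline("fill-mask", model="bert-base-uncased")
print(fill_mask("Transfer learning reuses a [MASK] model for a new task."))
```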

 

  2. Computer Vision

 

Transfer learning is also used in computer vision for image processing. Deep neural networks are used to solve image-related tasks because they are capable of detecting complex image features. Because the early layers learn general feature-detection logic, tuning only the higher layers leaves that base logic untouched.

 

Transfer learning is commonly used in image recognition, object detection, image noise removal, and other image-related tasks because all image-related tasks require basic knowledge and pattern detection of familiar images.

 

  3. Audio/Speech

 

Transfer learning is also used to solve audio/speech tasks such as speech recognition and speech-to-text translation. When we say "Siri" or "Hey Google!", an AI model developed for English speech recognition is busy at the backend processing our commands.

 

Surprisingly, a pre-trained AI model developed for English speech recognition serves as the foundation for a model for French speech recognition.


 

Transfer Learning in Six Steps


[Image: Six-Step Transfer Learning Procedure: obtain the pre-trained model, create a base model, freeze layers, create new trainable layers, train the new layers on the dataset, and fine-tune the model]


Finally, let's go over how transfer learning works in practice.

 

  1. Get a Pre-Trained Model

 

The first step is to decide which pre-trained model we want to use as the foundation of our training, depending on the task. To be compatible, transfer learning requires a strong correlation between the knowledge of the pre-trained source model and the target task domain.

 

In terms of computer vision:

 

  • VGG-16
  • VGG-19
  • Inception V3
  • XCeption
  • ResNet-50

 

Regarding NLP tasks:

 

  • Word2Vec
  • GloVe
  • FastText
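As a sketch, pre-trained word vectors such as GloVe can be loaded in a few lines (this assumes the `gensim` library and its downloader module, which are not mentioned in the article):

```python
import gensim.downloader as api

# Download and load 100-dimensional GloVe vectors trained on Wikipedia.
glove = api.load("glove-wiki-gigaword-100")
print(glove.most_similar("learning", topn=3))
```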

 

  2. Create a Base Model

 

The base model is one of the architectures, such as ResNet or Xception, that we chose in the first step because it is closely related to our task. We can download the pre-trained network weights, which saves the time of additional model training; otherwise, we will have to train the network architecture from scratch.

 

In some cases, the base model may have more neurons in the final output layer than we require for our use case. In such cases, we must remove the final output layer and modify it accordingly.
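A sketch of this step in Keras, using the Xception architecture chosen above; `include_top=False` drops the final 1000-class output layer:

```python
import tensorflow as tf

# Build the base model with downloaded ImageNet weights and no
# final classification layer.
base_model = tf.keras.applications.Xception(weights="imagenet",
                                            include_top=False,
                                            input_shape=(299, 299, 3))
```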

 

  3. Freeze Layers

 

To avoid the additional work of making the model learn the basic features, it is necessary to freeze the starting layers from the pre-trained model.

 

If we do not freeze the initial layers, we will lose all of the previous learning. This is equivalent to training the model from scratch and is a waste of time and resources.
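Continuing the sketch, freezing the pre-trained layers is one line in Keras:

```python
# Freeze every layer of the base model so its learned features
# are not overwritten during training.
base_model.trainable = False

# Equivalently, layers can be frozen one by one:
# for layer in base_model.layers:
#     layer.trainable = False
```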

 

  4. Create New Trainable Layers

 

The feature extraction layers are the only knowledge we reuse from the base model. To predict the model's specialized tasks, we must add additional layers on top of them. These are the final output layers in most cases.
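Continuing the sketch, with a hypothetical two-class target task:

```python
import tensorflow as tf

# Stack new trainable layers on top of the frozen base model.
inputs = tf.keras.Input(shape=(299, 299, 3))
x = base_model(inputs, training=False)          # keep frozen batch-norm stats
x = tf.keras.layers.GlobalAveragePooling2D()(x)
outputs = tf.keras.layers.Dense(2, activation="softmax")(x)  # two classes
model = tf.keras.Model(inputs, outputs)
```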

 

  5. Train the New Layers

 

The final output of the pre-trained model will almost certainly differ from the output we want for our model. Pre-trained models trained on the ImageNet dataset, for example, output predictions over 1,000 classes.

 

However, our model may need to work with only two classes. In this case, we must train the model with a new output layer.
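Continuing the sketch, only the new layers are trained at this stage (`train_ds` is a hypothetical labeled dataset):

```python
# Train only the new head; the frozen base model is unchanged.
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(train_ds, epochs=5)  # train_ds: hypothetical tf.data.Dataset
```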

 

  6. Fine-Tune Your Model

 

Fine-tuning is one method of improving performance. Fine-tuning entails unfreezing a portion of the base model and training the entire model on the entire dataset again at a very low learning rate. The low learning rate will improve the model's performance on the new dataset while preventing overfitting.
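Continuing the sketch, one hedged way to fine-tune is to unfreeze the top of the base model and retrain the whole network at a very low learning rate (the choice of 20 layers is illustrative):

```python
import tensorflow as tf

# Unfreeze the base model, then re-freeze all but its last 20 layers.
base_model.trainable = True
for layer in base_model.layers[:-20]:
    layer.trainable = False

# Recompile with a very low learning rate and train the whole model again.
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(train_ds, epochs=3)
```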

 

Also Read | How is Transfer Learning done in Neural Networks and CNN?

 

 

The Future of Transfer Learning

 

The future of machine learning is dependent on different organizations and businesses having widespread access to powerful models. 

 

Machine learning must be accessible and adaptable to the distinct local needs and requirements of organizations to revolutionize businesses and processes. Only a small percentage of businesses will have the expertise or resources to label data and train a model.

 

The main challenge is gathering large amounts of labeled data for the supervised machine learning process. Labeling data can be time-consuming, especially at scale, and the requirement for large amounts of labeled data puts the development of the most powerful models out of reach for most organizations.

 

Algorithms are likely to be developed centrally by organizations that have access to and resources for the massive amounts of labeled data required. However, when these models are deployed by other organizations, performance can suffer because each environment is slightly different from the one for which the model was trained. 

 

In practice, even highly accurate models can see their performance degrade in a new environment, which can be a barrier to the adoption of machine learning products and solutions.

 

Transfer learning will be critical in resolving this issue. Transfer learning techniques will allow powerful machine learning models developed at scale to be adapted for specific tasks and environments. Transfer learning will be a key driver of machine learning model distribution across new areas and industries.

 

To summarize, transfer learning provides many exciting research directions, as well as many applications that require models that can transfer knowledge to new tasks and adapt to new domains. 

 

Transfer learning is commonly used to save time and resources by eliminating the need to train multiple machine learning models from scratch for similar tasks, and as a cost-cutting measure in resource-intensive areas of machine learning such as image classification and natural language processing.
