
How to Apply Transfer Learning on Cifar-10 Using Convolutional Neural Network?

  • Tanesh Balodi
  • Sep 02, 2021

Transfer Learning

 

Isn’t it true that your experience riding a bicycle helps you learn to ride a motorbike or a scooter? That your experience in mathematics helps you solve physics problems? Just as we humans use previous experience to learn something new, transfer learning does the same on machines.

 

In transfer learning, a model trained on one task is reused to learn another: the result of one model becomes the starting point for the next.

 

This means that a model can take help from another model in order to improve its efficiency and refinement. Neural networks trained on natural images show a consistent phenomenon: the first layer tends to learn general features resembling Gabor filters, so these features can serve as a standard starting point for many datasets.

 

From image classification to sentiment analysis, transfer learning is used more and more by researchers. On images, it can be done by using a CNN as a feature extractor or by reusing a pre-trained network to reduce training time.
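For instance, here is a minimal sketch of the pre-trained-network approach using VGG16 from keras.applications (an illustration only; the walkthrough below builds and reuses its own small CNN instead):

from keras.applications import VGG16
from keras.models import Sequential
from keras.layers import Flatten, Dense

# Load the ImageNet-pretrained convolutional base without its classifier head
base = VGG16(weights='imagenet', include_top=False, input_shape=(32, 32, 3))
for layer in base.layers:
    layer.trainable = False  # freeze the pretrained feature extractor

# Attach a fresh classifier head for the 10 Cifar-10 classes
model = Sequential([base, Flatten(), Dense(10, activation='softmax')])
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])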

 

(Must read: How does Basic Convolution Work for Image Processing?)


 

About Cifar-10

 

Cifar-10 is named after the Canadian Institute For Advanced Research. It is a dataset of 60,000 images divided into 10 classes, with 6,000 images per class. These classes contain images of airplanes, automobiles, birds, cats, deer, dogs, frogs, horses, ships, and trucks.

 

Cifar-10 is one of the most widely used datasets for computer vision problems. Many researchers test their algorithms on it: since the dataset contains low-resolution images of 32x32 pixels, it is easy and cheap to run experiments on.

 

The first few layers of a convolutional neural network learn almost the same features regardless of the task, so these layers can be reused in another convolutional neural network. This not only reduces training time but also helps when data is limited. We will implement this transfer learning method below.

 

Implementing Transfer Learning Using Keras 

 

In the first step, we import all the libraries we will need: numpy for numerical computation, matplotlib for data visualization, and the cifar10 dataset from Keras. We also need the Sequential model from Keras and layers such as Dense, Conv2D, Flatten, Activation, and more.


import numpy as np
from matplotlib import pyplot as plt
import time

from keras.datasets import cifar10
from keras.models import Sequential
from keras.layers import Dense, Conv2D, Flatten, Activation, MaxPool2D, Dropout
from keras.utils import np_utils

# Load the predefined train/test splits of Cifar-10
(X_train, y_train), (X_test, y_test) = cifar10.load_data()
print(X_train.shape, y_train.shape, X_test.shape, y_test.shape)

 

(50000, 32, 32, 3) (50000, 1) (10000, 32, 32, 3) (10000, 1)

 

Loading the cifar-10 dataset gives us predefined training and testing splits: we can observe that there are 50,000 images for training and 10,000 images for testing.

 


# Boolean masks selecting classes 0-4 ("below 5") and classes 5-9 ("other 5")
below_5_train = (y_train < 5).flatten()
below_5_test = (y_test < 5).flatten()

other_5_train = (y_train >= 5).flatten()
other_5_test = (y_test >= 5).flatten()

 


In the step above, comparisons such as y_train < 5 produce boolean arrays of shape (N, 1); .flatten() collapses each one into a single dimension so it can be used as a boolean index into the data.


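As a quick standalone illustration of what .flatten() does here (a minimal numpy sketch, separate from the pipeline above):

import numpy as np

y = np.array([[3], [7], [1]])  # labels shaped (3, 1), like y_train
mask = (y < 5).flatten()       # collapse to shape (3,) so it can index rows
print(mask)                    # [ True False  True]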

 

plt.figure(0)

# Show the first four training images in a 2x2 grid
for i in range(4):
    plt.subplot(2, 2, i + 1)
    plt.imshow(X_train[i])

plt.show()

 


[Image: the first four Cifar-10 training images in a 2x2 grid]


Plotting a few images from the cifar-10 dataset with the matplotlib python library; in the grid above we can observe a frog, two trucks, and a deer.

 

# Merge the original train and test splits for each five-class subset
X_below_5 = np.concatenate([X_train[below_5_train], X_test[below_5_test]])
y_below_5 = np.concatenate([y_train[below_5_train], y_test[below_5_test]])

X_other_5 = np.concatenate([X_train[other_5_train], X_test[other_5_test]])
y_other_5 = np.concatenate([y_train[other_5_train], y_test[other_5_test]])

print(X_below_5.shape, y_below_5.shape, X_other_5.shape, y_other_5.shape)

 

((30000, 32, 32, 3), (30000, 1), (30000, 32, 32, 3), (30000, 1))

 

Concatenating the train and test splits for the below-5 and other-5 subsets; the shapes confirm that the two subsets contain an equal number of data points, 30,000 each.

 

# Each subset should total 6,000 images x 5 classes = 30,000
print(below_5_test.sum() + below_5_train.sum(),
      other_5_test.sum() + other_5_train.sum())

X_below_5 = X_below_5.reshape((-1, 32, 32, 3))
X_other_5 = X_other_5.reshape((-1, 32, 32, 3))

y_other_5 = y_other_5 - 5  # shift labels 5-9 down to 0-4 (num_classes is 5, not 10)
y_below_5 = np_utils.to_categorical(y_below_5)  # one-hot encode the labels
y_other_5 = np_utils.to_categorical(y_other_5)
print(y_below_5.shape, y_other_5.shape)

 

(30000, 5) (30000, 5)

 


Printing the two sums confirms that each subset holds 30,000 samples. The .reshape() function returns a new array of the shape the network expects, and to_categorical() converts the integer label array into one-hot vectors.
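For instance, a minimal sketch of what to_categorical() produces (using the older keras.utils.np_utils API imported above; newer Keras versions expose keras.utils.to_categorical directly):

labels = np.array([0, 2, 1])
print(np_utils.to_categorical(labels, 5))
# [[1. 0. 0. 0. 0.]
#  [0. 0. 1. 0. 0.]
#  [0. 1. 0. 0. 0.]]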

 


# Use the first 80% of each subset for training and the last 20% for testing
split_below_5 = int(0.8 * X_below_5.shape[0])
split_other_5 = int(0.8 * X_other_5.shape[0])

X_train_below_5 = X_below_5[:split_below_5]
y_train_below_5 = y_below_5[:split_below_5]

X_test_below_5 = X_below_5[split_below_5:]
y_test_below_5 = y_below_5[split_below_5:]

X_train_other_5 = X_other_5[:split_other_5]
y_train_other_5 = y_other_5[:split_other_5]

X_test_other_5 = X_other_5[split_other_5:]
y_test_other_5 = y_other_5[split_other_5:]

print(X_train_below_5.shape, y_train_below_5.shape)
print(X_train_other_5.shape, y_train_other_5.shape)

 

(24000, 32, 32, 3) (24000, 5)

(24000, 32, 32, 3) (24000, 5)

 

Splitting each subset 80/20 into training and testing sets and printing their shapes.
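As an aside, an equivalent shuffled split could be done with scikit-learn's train_test_split (an optional sketch, assuming scikit-learn is installed; the variable names X_tr, X_te, y_tr, y_te are illustrative, and the manual slicing above keeps the original sample order instead):

from sklearn.model_selection import train_test_split

# 80/20 split with shuffling
X_tr, X_te, y_tr, y_te = train_test_split(X_below_5, y_below_5, test_size=0.2, random_state=42)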

 

model_below_5 = Sequential()

# Convolutional feature extractor
model_below_5.add(Conv2D(8, 5, input_shape=(32, 32, 3), activation='relu'))
model_below_5.add(Conv2D(16, 5, activation='relu'))
model_below_5.add(MaxPool2D(pool_size=(2, 2)))
model_below_5.add(Conv2D(32, 3, activation='relu'))
model_below_5.add(Flatten())
model_below_5.add(Dropout(0.4))

# Classifier head for the five "below 5" classes
model_below_5.add(Dense(128))
model_below_5.add(Activation('relu'))

model_below_5.add(Dense(5))
model_below_5.add(Activation('softmax'))

model_below_5.summary()
model_below_5.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

 


[Image: model_below_5 summary output]


Using the Sequential model from Keras, we stack three convolutional layers with the relu activation function, a max-pooling layer, a Flatten layer, a Dropout layer (which randomly sets input units to 0 during training to prevent overfitting), and Dense layers with two activation functions (relu for the hidden layer, softmax for the output) to build and compile the convolutional neural network.


t0 = time.time()
hist_below_5 = model_below_5.fit(X_train_below_5, y_train_below_5,
                                 epochs=5,
                                 shuffle=True,
                                 batch_size=100,
                                 validation_data=(X_test_below_5, y_test_below_5))

print("Time Taken: ", time.time() - t0)

[Image: training log and time taken for model_below_5]


Fitting the model to the 24,000 training samples and printing the time taken; 5 epochs means the model passes over all 24,000 samples five times.

# Freeze the first six layers (the convolutional feature extractor)
for l in model_below_5.layers[:6]:
    l.trainable = False

for l in model_below_5.layers:
    print(l.trainable)



False
False
False
False
False
False
True
True
True
True



# Reuse the six frozen layers as the starting point for the new model
model_other_5 = Sequential(model_below_5.layers[:6])

# Attach a fresh, trainable classifier head for the other five classes
model_other_5.add(Dense(128))
model_other_5.add(Activation('relu'))
model_other_5.add(Dense(5))
model_other_5.add(Activation('softmax'))

model_other_5.summary()
model_other_5.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])


[Image: model_other_5 summary output]


The previous model is used as the starting point for another model, to improve efficiency and reduce the time taken for training. This is where transfer learning comes to work.
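As a quick sanity check (an optional sketch, not part of the original walkthrough): because Sequential was handed the very same layer objects, the reused layers share their weights with the first model.

# Illustrative check: the first Conv2D layer is the same object in both models
w_old = model_below_5.layers[0].get_weights()[0]
w_new = model_other_5.layers[0].get_weights()[0]
print(np.array_equal(w_old, w_new))  # expected: True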

 

t0 = time.time()
hist_other_5 = model_other_5.fit(X_train_other_5, y_train_other_5,
                                 epochs=5,
                                 shuffle=True,
                                 batch_size=100,
                                 validation_data=(X_test_other_5, y_test_other_5))

print("Time Taken: ", time.time() - t0)

[Image: training log and time taken for model_other_5]


The same number of epochs and the same batch size are used to train the model with transfer learning, yet we can clearly observe that not only has the model's accuracy significantly increased, but the time taken to train it is also less than half of what it was.
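To compare the two runs side by side, the recorded training histories can be plotted (an optional sketch; note that the history key is 'val_acc' in older Keras releases and 'val_accuracy' in newer ones):

plt.plot(hist_below_5.history['val_accuracy'], label='from scratch (classes 0-4)')
plt.plot(hist_other_5.history['val_accuracy'], label='transfer (classes 5-9)')
plt.xlabel('epoch')
plt.ylabel('validation accuracy')
plt.legend()
plt.show()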


 

Conclusion

 

In the first half of the blog, we covered the basics of transfer learning; the second half walked through its implementation on the Cifar-10 dataset, where we explored the dataset, split it by label into two five-class subsets, and used two convolutional neural networks to perform transfer learning.

 

(Also read: 5 Architectures of convolution neural networks)

 

The outputs of the two models, one trained from scratch and one with transfer learning, demonstrate how much transfer learning can improve both the training time and the accuracy of deep learning algorithms.
