Coders Packet

Image classification using a CNN in Python with TensorFlow and Keras

By Aditya Tulsiyan

We build a convolutional neural network and train it on the CIFAR-10 and MNIST datasets to classify images into their corresponding classes. The project is written in Python using TensorFlow and Keras.

Convolutional Neural Networks in Python using TensorFlow and Keras


 

About the datasets:

1) CIFAR-10 dataset

This dataset consists of 60,000 colour images of 32×32 pixels with 3 channels (RGB). There are 10 classes, and each class contains 6,000 images.

The classes are Airplane, Automobile, Bird, Cat, Deer, Dog, Frog, Horse, Ship, and Truck. 50,000 images are used for training and 10,000 for the test set.

The classes are mutually exclusive.

 

2) MNIST dataset

This dataset consists of 70,000 grey-scale images of 28×28 pixels. There are 10 classes, and each class contains roughly 7,000 images.

The classes are the digits 0-9. 60,000 images are used for training and 10,000 for the test set.

The images are size normalized and centered in a fixed-size image.

 

Preprocessing:

The datasets are already curated, so we only normalize the pixel values and convert the integer labels to one-hot encoding.
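A minimal sketch of this preprocessing step using NumPy. The arrays here are random stand-ins shaped like what `keras.datasets.cifar10.load_data()` returns; in the project the real dataset arrays would be used instead.

```python
import numpy as np

# Hypothetical stand-ins for the raw arrays returned by
# keras.datasets.cifar10.load_data(): uint8 images, integer labels.
x_train = np.random.randint(0, 256, size=(8, 32, 32, 3), dtype=np.uint8)
y_train = np.random.randint(0, 10, size=(8,))

# Normalize pixel values from [0, 255] down to [0, 1].
x_train = x_train.astype("float32") / 255.0

# One-hot encode the integer labels
# (equivalent to keras.utils.to_categorical(y_train, 10)).
num_classes = 10
y_train_onehot = np.eye(num_classes, dtype="float32")[y_train]
```

After this step each label is a length-10 vector with a single 1 at the class index, which matches the softmax output of the network.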

 

Models

The model we use is [Conv2D -> BatchNormalization -> Conv2D -> BatchNormalization -> MaxPooling -> Dropout] x3 -> Flatten -> Dense -> Dropout -> Dense.

This is a 22-layer neural network.
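This layer sequence can be sketched in Keras as below. The write-up does not state kernel sizes, filter counts, activations, or dropout rates, so those are assumptions here: 3×3 kernels with "same" padding, filter counts of 32/32/64/64/128/128, a 128-unit hidden Dense layer, ReLU activations, and guessed dropout rates. These choices do reproduce the parameter totals reported below (552,362 for CIFAR-10 and 437,098 for MNIST).

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_model(input_shape=(32, 32, 3), num_classes=10):
    """22-layer CNN following the layout above; rates/activations assumed."""
    model = keras.Sequential()
    model.add(keras.Input(shape=input_shape))
    for filters in (32, 64, 128):
        # [Conv2D -> BatchNormalization -> Conv2D -> BatchNormalization
        #  -> MaxPooling -> Dropout], repeated three times
        model.add(layers.Conv2D(filters, (3, 3), padding="same", activation="relu"))
        model.add(layers.BatchNormalization())
        model.add(layers.Conv2D(filters, (3, 3), padding="same", activation="relu"))
        model.add(layers.BatchNormalization())
        model.add(layers.MaxPooling2D((2, 2)))
        model.add(layers.Dropout(0.25))   # rate is an assumption
    model.add(layers.Flatten())
    model.add(layers.Dense(128, activation="relu"))
    model.add(layers.Dropout(0.5))        # rate is an assumption
    model.add(layers.Dense(num_classes, activation="softmax"))
    return model

model = build_model()          # CIFAR-10 variant, (32, 32, 3) input
# build_model(input_shape=(28, 28, 1)) gives the MNIST variant
```

For MNIST the only change is the input shape, which shrinks the Flatten output and hence the first Dense layer's weight matrix.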

Layers

1) Conv2D layer: a 2D convolutional layer, in which a filter (kernel) of n×n height and width is slid over the input to produce feature maps.

2) BatchNormalization layer: normalizes a layer's inputs using the batch mean and variance, which stabilizes and speeds up training and can reduce overfitting.

3) MaxPooling2D layer: downsamples the input by taking the maximum value within each n×n window.

4) Dropout layer: helps prevent overfitting by randomly setting a fraction of the input units to 0 during training.

5) Flatten layer: converts the input into a 1D array. It does not change the batch size.

6) Dense layer: a standard densely (fully) connected neural network layer.
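To make the max-pooling operation concrete, here is a tiny NumPy sketch of 2×2 pooling with stride 2 on a single 4×4 channel (the MaxPooling2D layer does this per channel, per image):

```python
import numpy as np

# A single 4x4 feature map (one channel of one image).
x = np.array([[1, 3, 2, 0],
              [4, 2, 1, 5],
              [7, 8, 2, 1],
              [0, 6, 3, 4]])

# Split into 2x2 blocks, then take the maximum of each block.
pooled = x.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(pooled)  # -> [[4 5]
               #     [8 4]]
```

Each output value keeps only the strongest activation in its window, halving the height and width while preserving the most salient features.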


CIFAR-10 dataset:

There are 552,362 parameters in total, of which 551,466 are trainable and 896 are non-trainable.

The final Dense layer has 1,290 parameters (128 × 10 weights + 10 biases).

 

MNIST dataset:

There are 437,098 parameters in total, of which 436,202 are trainable and 896 are non-trainable.

The final Dense layer has 1,290 parameters.

 

Training

1) CIFAR-10 dataset

We train the model for 70 epochs.

2) MNIST dataset

We train the model for 30 epochs.
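A minimal training sketch. The optimizer, batch size, and the tiny stand-in model and random data below are assumptions used only to keep the example self-contained; the actual project trains the 22-layer CNN on the real datasets for the epoch counts given above (70 for CIFAR-10, 30 for MNIST).

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Tiny stand-in model so the sketch runs quickly (not the real 22-layer CNN).
model = keras.Sequential([
    keras.Input(shape=(32, 32, 3)),
    layers.Conv2D(8, (3, 3), activation="relu"),
    layers.Flatten(),
    layers.Dense(10, activation="softmax"),
])

# categorical_crossentropy matches the one-hot encoded labels;
# the choice of Adam is an assumption.
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])

# Small random batch standing in for the real training data.
x = np.random.rand(16, 32, 32, 3).astype("float32")
y = np.eye(10, dtype="float32")[np.random.randint(0, 10, size=16)]

history = model.fit(x, y, epochs=1, batch_size=8, verbose=0)
```

With the real data, `epochs` would be set to 70 or 30 and the per-epoch accuracy would be read from `history.history["accuracy"]`.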

 

Results

1) CIFAR-10 dataset

Training set accuracy:

During training, the accuracy after the first epoch is 31.90%; after 10 epochs it rises to 78.88%.

As the number of epochs increases, the accuracy keeps improving:

Epoch 20: 85.66%

Epoch 40: 89.93%

Epoch 70: 92.41%

Test set Accuracy: 86.86%

 

2) MNIST dataset

Training set accuracy:

During training, the accuracy is already about 98% after the first epoch.

Epoch 10: 99.44%

Test set Accuracy: 99.37%

 

Applications

1) CIFAR-10 dataset

a) General-purpose image recognition

2) MNIST dataset

The MNIST digit recognizer is an algorithm that can identify handwritten digits. This can be used for a number of applications, such as:

  1. Automating cheque clearing
  2. Sorting postal mail by reading handwritten postal codes
  3. Digitizing handwritten numbers in scanned forms and documents








Submitted by Aditya Tulsiyan (Adi1423)
