By Mirza Yusuf
Classification of 101 food items from the food101 dataset in Python using TensorFlow.
In this project, we will learn how to classify images of food using deep neural networks. We will classify all 101 food items in the dataset.
Image classification generally requires a big dataset; in this case, the number of images per individual food item is relatively small. Therefore we will use a pre-trained ResNet152V2 model, fine-tune its last 15 layers, and add a few layers of our own on top.
DATASET
The food dataset can be found here: food101.
DEPENDENCIES
First and most important is installing the dependencies and then importing the libraries. We will use quite a few, e.g. TensorFlow, matplotlib, and NumPy.
pip install tensorflow
pip install matplotlib
pip install numpy
NumPy is optional here, since it is installed automatically as a dependency of TensorFlow.
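Before going further, it is worth a quick sanity check that TensorFlow imported correctly and can see a GPU if one is present; a minimal check, assuming TensorFlow 2.x:

import tensorflow as tf

# Print the installed version and any visible GPUs;
# an empty list simply means training will run on the CPU.
print(tf.__version__)
print(tf.config.list_physical_devices('GPU'))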
Now that we are done with the dependencies, we can move on to importing the libraries.
Importing Libraries
import numpy as np
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras import layers
from tensorflow.keras import Model
With the libraries imported, we can now move forward to preprocessing our data.
Data Preprocessing
In this code we will keep the image height and width moderate at 128x128 and use a batch size of 100, which means the network processes 100 images at a time. We also record the number of classes, 101.
batch_size = 100
img_h = 128
img_w = 128
num_clas = 101
Now we define the names of the classes.
clas = ['apple_pie', 'baby_back_ribs', 'baklava', 'beef_carpaccio', 'beef_tartare',
        'beet_salad', 'beignets', 'bibimbap', 'bread_pudding', 'breakfast_burrito',
        'bruschetta', 'caesar_salad', 'cannoli', 'caprese_salad', 'carrot_cake',
        'ceviche', 'cheese_plate', 'cheesecake', 'chicken_curry', 'chicken_quesadilla',
        'chicken_wings', 'chocolate_cake', 'chocolate_mousse', 'churros', 'clam_chowder',
        'club_sandwich', 'crab_cakes', 'creme_brulee', 'croque_madame', 'cup_cakes',
        'deviled_eggs', 'donuts', 'dumplings', 'edamame', 'eggs_benedict',
        'escargots', 'falafel', 'filet_mignon', 'fish_and_chips', 'foie_gras',
        'french_fries', 'french_onion_soup', 'french_toast', 'fried_calamari', 'fried_rice',
        'frozen_yogurt', 'garlic_bread', 'gnocchi', 'greek_salad', 'grilled_cheese_sandwich',
        'grilled_salmon', 'guacamole', 'gyoza', 'hamburger', 'hot_and_sour_soup',
        'hot_dog', 'huevos_rancheros', 'hummus', 'ice_cream', 'lasagna',
        'lobster_bisque', 'lobster_roll_sandwich', 'macaroni_and_cheese', 'macarons', 'miso_soup',
        'mussels', 'nachos', 'omelette', 'onion_rings', 'oysters',
        'pad_thai', 'paella', 'pancakes', 'panna_cotta', 'peking_duck',
        'pho', 'pizza', 'pork_chop', 'poutine', 'prime_rib',
        'pulled_pork_sandwich', 'ramen', 'ravioli', 'red_velvet_cake', 'risotto',
        'samosa', 'sashimi', 'scallops', 'seaweed_salad', 'shrimp_and_grits',
        'spaghetti_bolognese', 'spaghetti_carbonara', 'spring_rolls', 'steak', 'strawberry_shortcake',
        'sushi', 'tacos', 'takoyaki', 'tiramisu', 'tuna_tartare',
        'waffles']
These are the names of all 101 food categories we are trying to classify. A list this long is easy to get wrong by hand: a single missing comma silently merges two adjacent names into one string. As shown in the sketch below, the list can also be generated from the directory structure.
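If your training directory contains one sub-folder per class, as the Kaggle-style layout used later in this tutorial does, the same list can be built automatically; a minimal sketch:

import os

# Each sub-folder of the training directory is one class;
# sorting keeps the ordering deterministic across runs.
clas = sorted(os.listdir('../input/food101/training'))
print(len(clas))  # expect 101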
Train and Validation Datagen
Even though the dataset as a whole is big, the number of images per food item is small. That's why we use ImageDataGenerator for data augmentation: it extends our effective dataset by randomly zooming, rotating, and flipping the training images.
train_datagen = ImageDataGenerator(rotation_range=50,
                                   width_shift_range=0.2,
                                   height_shift_range=0.2,
                                   zoom_range=0.3,
                                   horizontal_flip=True,
                                   vertical_flip=True,
                                   fill_mode='constant',
                                   cval=0,
                                   rescale=1./255)

valid_datagen = ImageDataGenerator(rotation_range=45,
                                   width_shift_range=0.2,
                                   height_shift_range=0.2,
                                   zoom_range=0.3,
                                   horizontal_flip=True,
                                   vertical_flip=True,
                                   fill_mode='constant',
                                   cval=0,
                                   rescale=1./255)

test_datagen = ImageDataGenerator(rescale=1./255)
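To get a feel for what these parameters do, you can apply a single random transformation to an image array; a small sketch using a dummy image (random_transform applies the geometric augmentations but not the 1/255 rescaling, which the generator applies separately):

import numpy as np

# Apply one random rotation/shift/zoom/flip combination,
# as configured above, to a dummy 128x128 RGB image.
dummy = np.random.rand(128, 128, 3).astype('float32')
augmented = train_datagen.random_transform(dummy)
print(augmented.shape)  # (128, 128, 3)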
Loading the dataset
train_dir = '../input/food101/training'
val_dir = '../input/food101/validation'
Preprocessing and Rescaling
We select a fixed random seed so that the shuffled data stays consistent between runs. We resize every image to 128x128, the input size we want, and divide the data into batches so that training becomes manageable.
seed = 10

train_gen = train_datagen.flow_from_directory(train_dir,
                                              batch_size=batch_size,
                                              target_size=(img_h, img_w),
                                              classes=clas,
                                              class_mode='categorical',
                                              shuffle=True,
                                              seed=seed)

valid_gen = valid_datagen.flow_from_directory(val_dir,
                                              batch_size=batch_size,
                                              target_size=(img_h, img_w),
                                              classes=clas,
                                              class_mode='categorical',
                                              shuffle=True,
                                              seed=seed)
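To confirm the generators are wired up correctly, you can pull a single batch and inspect it; a brief sketch, assuming matplotlib is available:

import matplotlib.pyplot as plt

# One batch: images of shape (100, 128, 128, 3), one-hot labels of length 101.
images, labels = next(train_gen)
print(images.shape, labels.shape)

# Show the first nine augmented images with their class names.
fig, axes = plt.subplots(3, 3, figsize=(8, 8))
for img, label, ax in zip(images, labels, axes.flat):
    ax.imshow(img)
    ax.set_title(clas[int(label.argmax())], fontsize=8)
    ax.axis('off')
plt.show()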
Loading the pretrained model
We are using ResNet152V2, a very deep model pre-trained on the ImageNet dataset. We then fine-tune its last 15 layers for our dataset while keeping the rest frozen.
from tensorflow.keras.layers import GlobalAveragePooling2D, Flatten, Dense, Dropout

Res_model = tf.keras.applications.ResNet152V2(weights='imagenet',
                                              include_top=False,
                                              input_shape=(128, 128, 3))

# Freeze everything except the last 15 layers, which we fine-tune
for layer in Res_model.layers[:-15]:
    layer.trainable = False

x = Res_model.output
x = GlobalAveragePooling2D()(x)
x = Flatten()(x)
x = Dense(units=512, activation='swish')(x)
x = Dropout(0.3)(x)
x = Dense(units=512, activation='swish')(x)
x = Dropout(0.3)(x)
output = Dense(units=num_clas, activation='softmax')(x)

model = Model(Res_model.input, output)
model.summary()
On top of the pre-trained base we add a few layers of our own for better accuracy. Instead of the generic 'relu' activation for the Dense layers, 'swish' has been used. Swish is an activation function proposed by Google researchers that tends to outperform ReLU in deeper neural networks.
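Before compiling, it is worth verifying that the freeze actually took effect; a quick sketch that counts frozen versus trainable layers in the base network:

# All but the last 15 layers of the ResNet base should report trainable=False.
frozen = sum(1 for layer in Res_model.layers if not layer.trainable)
trainable = sum(1 for layer in Res_model.layers if layer.trainable)
print(f'frozen layers: {frozen}, trainable layers: {trainable}')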
Swish Activation
from tensorflow.keras.backend import sigmoid

def swish(x, beta=1):
    # Swish is simply x scaled by a sigmoid gate: x * sigmoid(beta * x)
    return x * sigmoid(beta * x)
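Note that TensorFlow 2.2 and later ship 'swish' as a built-in activation, so the string used in the Dense layers above resolves automatically. On older versions you would need to register the custom function yourself; a hedged sketch of that registration, using the swish function defined above:

from tensorflow.keras.layers import Activation
from tensorflow.keras.utils import get_custom_objects

# Register the custom function so that activation='swish' resolves
# on TensorFlow versions that predate the built-in implementation.
get_custom_objects().update({'swish': Activation(swish)})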
Compiling the Model
Here we use the Adam optimizer and the 'categorical_crossentropy' loss, since we are dividing the images into categories and the labels are one-hot encoded.
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
Fitting and optimizing Learning Rate
Now it's time to fit our model. Since this is a very large model, settling on the optimum loss with a fixed learning rate is difficult, so the learning rate needs to change as training progresses. ReduceLROnPlateau reduces the learning rate whenever the monitored metric (here, validation accuracy) stops improving for a set number of epochs, which helps the optimizer find a better minimum for our loss.
from tensorflow.keras.callbacks import ReduceLROnPlateau

learn_rate = ReduceLROnPlateau(monitor='val_accuracy',
                               patience=3,
                               verbose=1,
                               factor=0.4,
                               min_lr=0.0001)
callbacks = [learn_rate]

STEP_SIZE_TRAIN = train_gen.n // train_gen.batch_size
STEP_SIZE_VALID = valid_gen.n // valid_gen.batch_size

# model.fit accepts generators directly; fit_generator is deprecated
history = model.fit(train_gen,
                    steps_per_epoch=STEP_SIZE_TRAIN,
                    validation_data=valid_gen,
                    validation_steps=STEP_SIZE_VALID,
                    epochs=50,
                    callbacks=callbacks)
Here we train for 50 epochs, which means the entire dataset passes through the network 50 times. The number of epochs is high because the 101 classes are visually varied and often similar to one another, so the network needs time to identify the distinguishing features.
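Fifty epochs on a model this size is a long run, so it can also be worth keeping the best weights as training goes. A hedged sketch of two extra callbacks that could be added to the list before calling fit (the filename and patience values are arbitrary choices):

from tensorflow.keras.callbacks import ModelCheckpoint, EarlyStopping

# Keep only the weights from the epoch with the best validation accuracy,
# and stop early if validation accuracy stalls for 10 epochs.
checkpoint = ModelCheckpoint('best_food101.h5', monitor='val_accuracy',
                             save_best_only=True, verbose=1)
early_stop = EarlyStopping(monitor='val_accuracy', patience=10,
                           restore_best_weights=True, verbose=1)
callbacks = [learn_rate, checkpoint, early_stop]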
Evaluating the Model
After 50 epochs, our training accuracy reaches around 90% and validation accuracy around 85%. Now it's time to evaluate the model.
# evaluate_generator is deprecated; model.evaluate accepts generators directly
model.evaluate(valid_gen, steps=STEP_SIZE_VALID)
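Once you are satisfied with the numbers, the fine-tuned model can be saved so inference does not require retraining; a minimal sketch (the filename is an arbitrary choice):

# Persist the fine-tuned model (architecture + weights).
model.save('food101_resnet152v2.h5')

# Reload it later for inference without retraining.
reloaded = tf.keras.models.load_model('food101_resnet152v2.h5')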
Plotting the loss and accuracy
After training, we can plot the training and validation losses along with the accuracies.
%matplotlib inline
import matplotlib.pyplot as plt

acc = history.history['accuracy']
val_acc = history.history['val_accuracy']
loss = history.history['loss']
val_loss = history.history['val_loss']
epochs = range(len(acc))

plt.plot(epochs, acc, 'r', label='Training accuracy')
plt.plot(epochs, val_acc, 'b', label='Validation accuracy')
plt.title('Training and validation accuracy')
plt.legend()

plt.figure()
plt.plot(epochs, loss, 'r', label='Training Loss')
plt.plot(epochs, val_loss, 'b', label='Validation Loss')
plt.title('Training and validation loss')
plt.legend()
plt.show()
Test
Finally, we can feed in images of our choice and test the model.
import cv2

def predict_food(imagepath):
    image = cv2.imread(imagepath)
    # OpenCV loads BGR; convert to the RGB ordering used in training
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    # Resize to the 128x128 input size the network was trained on
    res = cv2.resize(image, dsize=(128, 128), interpolation=cv2.INTER_AREA)
    # Apply the same 1/255 rescaling used by the data generators
    res = np.array(res, dtype='float32') / 255.0
    res = np.expand_dims(res, axis=0)
    predict = model.predict(res)
    return predict

print(predict_food('Your file directory'))
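predict_food returns a vector of 101 class probabilities, so to get a readable answer we take the argmax and look it up in the class list; for example (the path is a placeholder):

# Convert the softmax output into a human-readable label.
probs = predict_food('Your file directory')
idx = int(np.argmax(probs))
print(clas[idx], float(probs[0][idx]))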
Submitted by Mirza Yusuf (Yusuf)