By Samyak Jain
This Python project learns the American Sign Language alphabet and then predicts letters from images it has never seen before. The model is a Convolutional Neural Network built with Keras.
The Project code is available in notebook format as well as Python script.
The dataset is taken from Kaggle and is downloaded automatically inside the project. (The user needs to have the Kaggle API set up on their computer. For more info, refer to https://github.com/Kaggle/kaggle-api#:~:text=API%20credentials,file%20containing%20your%20API%20credentials.)
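As a quick sketch of that setup: the Kaggle CLI reads an API token from ~/.kaggle/kaggle.json, which you generate via "Create New API Token" on your Kaggle account page. The ~/Downloads path below is only an example of where the browser may have saved the token:

```shell
# Create the config directory the Kaggle CLI reads from
mkdir -p ~/.kaggle

# kaggle.json comes from "Create New API Token" on your Kaggle account page;
# ~/Downloads/kaggle.json is just an assumed download location
cp ~/Downloads/kaggle.json ~/.kaggle/kaggle.json

# The CLI warns unless the token is readable only by you
chmod 600 ~/.kaggle/kaggle.json
```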
Link for the dataset for manual download (NOT RECOMMENDED) - https://www.kaggle.com/datamunge/sign-language-mnist
Link for the dataset for manual download (GOOGLE DRIVE) - LINK
We will perform model training in 6 steps:
1. Downloading Data from the net.
2. Extracting data into a useful form.
3. Preprocessing the data so performance increases.
4. Defining the model for training.
5. Training the model with the data.
6. Evaluating success.
This Kaggle API command downloads the dataset from the net onto the local machine as a zip archive.
!kaggle datasets download -d datamunge/sign-language-mnist
To extract the data into the same location -
import os
import zipfile

zip_path = os.path.join(folder_path, "sign-language-mnist.zip")
zipper = zipfile.ZipFile(zip_path)
zipper.extractall(folder_path)
where folder_path is the location of the folder.
Reading the data into a pandas DataFrame and converting it to a NumPy array -
import numpy as np
import pandas as pd

raw_data = pd.read_csv(csv_path)
raw_data = raw_data.astype(np.float32)
arr = np.array(raw_data)
Splitting the data into training set and validation set -
split = int(0.8 * len(arr))
train_label = arr[:split, 0]
train_data = arr[:split, 1:]
val_label = arr[split:, 0]
val_data = arr[split:, 1:]
Reshaping the data into correct shape so it can be fed into the model -
train_data = train_data.reshape(-1, 28, 28, 1)
val_data = val_data.reshape(-1, 28, 28, 1)
The same steps are repeated for TEST DATA.
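As a runnable sketch of those repeated steps, the snippet below prepares the test set the same way. A small synthetic array of [label, 784 pixel values] rows stands in for the contents of the test CSV (in the project it would come from pd.read_csv on the test file), so the snippet runs standalone:

```python
import numpy as np

# In the project this array comes from reading the test CSV, e.g.:
#   raw_test = pd.read_csv(test_csv_path).astype(np.float32)
#   test_arr = np.array(raw_test)
# Here we fake 10 rows of [label, 784 pixel values] so the sketch is runnable.
rng = np.random.default_rng(0)
test_arr = np.hstack([
    rng.integers(0, 25, size=(10, 1)),     # labels 0-24
    rng.integers(0, 256, size=(10, 784)),  # 28*28 pixel values
]).astype(np.float32)

# Split off labels, reshape images to (N, 28, 28, 1) for the CNN
test_label = test_arr[:, 0]
test_data = test_arr[:, 1:]
test_data = test_data.reshape(-1, 28, 28, 1)

print(test_data.shape)  # (10, 28, 28, 1)
```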
Scaling the pixel values into the range [0, 1] so the model can process them more easily -
train_data = train_data / 255
val_data = val_data / 255
test_data = test_data / 255
Defining Model includes 2 Steps-
1. Creating the Model.
2. Compiling the model with appropriate loss and optimizer.
CREATING THE MODEL -
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Conv2D(32, (5, 5), activation='relu', input_shape=(28, 28, 1)),
    layers.MaxPooling2D(2),
    layers.Conv2D(64, (3, 3), activation='relu', padding='same'),
    layers.MaxPooling2D(2),
    layers.Conv2D(128, (3, 3), activation='relu', padding='same'),
    layers.Conv2D(128, (3, 3), activation='relu', padding='same'),
    layers.MaxPooling2D(2),
    layers.Conv2D(256, (3, 3), activation='relu', padding='same'),
    layers.Conv2D(256, (3, 3), activation='relu', padding='same'),
    layers.Flatten(),
    layers.Dense(64, activation='relu'),
    layers.Dropout(0.4),
    layers.Dense(128, activation='relu'),
    layers.Dropout(0.4),
    layers.Dense(25, activation='softmax')
])
The final layer has 25 outputs because the dataset's labels run from 0 to 24. Only 24 letters actually appear: J and Z are excluded because they are motion gestures and cannot be captured in a single image.
The model has 1,285,081 trainable parameters.
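That count can be verified with model.summary(), or tallied by hand: a Conv2D layer has (kernel_h * kernel_w * in_channels + 1) * filters parameters and a Dense layer has (inputs + 1) * units. The sketch below walks the architecture above layer by layer:

```python
def conv_params(kh, kw, in_ch, filters):
    # kernel weights plus one bias per filter
    return (kh * kw * in_ch + 1) * filters

def dense_params(inputs, units):
    # weight matrix plus one bias per unit
    return (inputs + 1) * units

total = (
    conv_params(5, 5, 1, 32)       # 28x28x1 -> 24x24x32 (valid padding)
    + conv_params(3, 3, 32, 64)    # after 2x2 pool: 12x12x32 -> 12x12x64
    + conv_params(3, 3, 64, 128)   # after pool: 6x6x64 -> 6x6x128
    + conv_params(3, 3, 128, 128)
    + conv_params(3, 3, 128, 256)  # after pool: 3x3x128 -> 3x3x256
    + conv_params(3, 3, 256, 256)
    + dense_params(3 * 3 * 256, 64)  # Flatten yields 2304 features
    + dense_params(64, 128)
    + dense_params(128, 25)
)
print(total)  # 1285081
```

Pooling and dropout layers contribute no parameters, so they do not appear in the tally.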
COMPILING THE MODEL -
model.compile(loss='sparse_categorical_crossentropy',
              optimizer='rmsprop',
              metrics=['acc'])
The model is compiled using RMSProp Optimizer and Sparse Categorical Cross-Entropy Loss.
history = model.fit(train_data, train_label,
                    epochs=20,
                    batch_size=256,
                    validation_data=(val_data, val_label),
                    verbose=1)
The model is trained for 20 epochs and validated on the validation data.
history2 = model.evaluate(test_data, test_label)
After training, the model achieves -
Training Accuracy => 99.18%
Training Loss => 0.0328
Validation Accuracy => 99.8%
Validation Loss => 0.0153
Test Accuracy => 97.71%
Test Loss => 0.1689
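To predict on a new image, the model's 25-way softmax output is reduced to a label with argmax and mapped to a letter; since labels 0-24 line up with A-Y (label 9, J, simply never occurs in the data), chr(ord('A') + label) recovers the letter. Below, a hand-made probability vector stands in for the model.predict output so the sketch runs without a trained model:

```python
import numpy as np

# In the project this vector would come from the trained model, e.g.:
#   probs = model.predict(image.reshape(1, 28, 28, 1) / 255)[0]
# Here a synthetic 25-way distribution peaked at label 2 stands in for it.
probs = np.full(25, 0.01)
probs[2] = 0.76  # pretend the model is confident in label 2

label = int(np.argmax(probs))
letter = chr(ord('A') + label)  # labels 0-24 <-> letters A-Y (J unused)
print(label, letter)            # 2 C
```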
Submitted by Samyak Jain (samyak1230)