By Dev Mehta
We train an LSTM network to perform English-to-French translation. The project is implemented using TensorFlow and Keras in Python.
This is a complete project whose code anyone can use to translate English to French.
First, we install all the required dependencies and import the libraries needed to implement the translator. Next, we load the dataset and visualise it to get a better understanding of the data we are working with. The dataset contains 137,860 examples. We also check whether any of the values are null and then concatenate the English and French data side by side.
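A minimal sketch of this loading step might look like the following; the file names small_vocab_en and small_vocab_fr are assumptions for illustration, not necessarily the paths used in the packet.

```python
import pandas as pd

# Assumed file names; the actual packet may use different paths.
with open('small_vocab_en', 'r', encoding='utf-8') as f:
    english = f.read().split('\n')
with open('small_vocab_fr', 'r', encoding='utf-8') as f:
    french = f.read().split('\n')

# Place the two languages side by side in one DataFrame.
data = pd.DataFrame({'english': english, 'french': french})
print(len(data))            # expect 137,860 sentence pairs
print(data.isnull().sum())  # confirm there are no null values
print(data.head())
```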
Now we need to perform data cleaning. We start by removing punctuation such as !, ?, and # from the data. We do not remove stop words, since every word matters for translation. We then count the number of unique words used in each language, as this is needed later on. Next, we find the maximum number of words in a sentence in each language and apply padding so that all sentences have the same length.
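One way to sketch this cleaning step is shown below; the function and variable names are ours, and the actual packet may organise it differently.

```python
import string
from collections import Counter

def clean(sentence):
    # Lowercase and strip punctuation such as !, ?, # and commas.
    sentence = sentence.lower()
    return sentence.translate(str.maketrans('', '', string.punctuation))

data['english'] = data['english'].apply(clean)
data['french'] = data['french'].apply(clean)

# Count unique words per language (used for the vocabulary sizes later).
eng_counter = Counter(word for s in data['english'] for word in s.split())
fr_counter = Counter(word for s in data['french'] for word in s.split())
print('Unique English words:', len(eng_counter))
print('Unique French words:', len(fr_counter))

# Longest sentence length per language (used for padding).
max_eng_len = data['english'].str.split().str.len().max()
max_fr_len = data['french'].str.split().str.len().max()
print(max_eng_len, max_fr_len)
```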
Now we implement the seq2seq part of the model. We apply tokenization to convert texts into sequences of numbers and visualise the result for one example. We also implement a function that does the opposite, so that we can read the output we get after prediction. For those who do not know what seq2seq is: we convert the sentences into sequences of numbers by assigning each new word an integer id, train the model on these numbers, obtain predictions as numbers, and then convert the numbers back into words using the sequence-to-text (reverse) function that we created.
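The tokenization and the reverse function could be sketched as follows, using the standard Keras Tokenizer; the helper names (tokenize, sequence_to_text) are assumptions for illustration.

```python
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

def tokenize(sentences):
    tokenizer = Tokenizer()
    tokenizer.fit_on_texts(sentences)
    return tokenizer.texts_to_sequences(sentences), tokenizer

eng_seq, eng_tokenizer = tokenize(data['english'])
fr_seq, fr_tokenizer = tokenize(data['french'])

# Pad every sequence to the same length so it fits a fixed-size model input.
eng_pad = pad_sequences(eng_seq, maxlen=max_eng_len, padding='post')
fr_pad = pad_sequences(fr_seq, maxlen=max_fr_len, padding='post')

def sequence_to_text(sequence, tokenizer):
    # Reverse mapping: token ids back to words, skipping padding zeros.
    index_to_word = {i: w for w, i in tokenizer.word_index.items()}
    return ' '.join(index_to_word.get(i, '') for i in sequence if i != 0)

print(eng_pad[0])                                    # a sentence as numbers
print(sequence_to_text(eng_pad[0], eng_tokenizer))   # and back to words
```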
We then split the dataset into train and test sets and create the model. We first add an embedding layer so that the network can learn a low-dimensional continuous representation of the discrete input tokens; the embedding layer acts somewhat like a PCA or an autoencoder, helping the subsequent layers learn more effectively with fewer resources. We then add an LSTM layer with 256 units, a RepeatVector, and another LSTM layer. A RepeatVector is used when we want to generate something iteratively but only have one input vector. Finally, we add a Dense softmax layer and compile the model with the Adam optimizer, sparse categorical cross-entropy loss, and accuracy as the metric.
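Putting the layers described above together gives roughly the following model; the embedding dimension of 256 is an assumption, as the write-up only specifies 256 units for the LSTM layers.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, RepeatVector, Dense

eng_vocab = len(eng_tokenizer.word_index) + 1
fr_vocab = len(fr_tokenizer.word_index) + 1

model = Sequential([
    # Learn a dense 256-dimensional representation of each English word id.
    Embedding(eng_vocab, 256, input_length=max_eng_len),
    # Encoder: compress the whole sentence into a single state vector.
    LSTM(256),
    # Repeat that vector once per French output timestep.
    RepeatVector(max_fr_len),
    # Decoder: produce one hidden state per output word.
    LSTM(256, return_sequences=True),
    # Softmax over the French vocabulary at every timestep.
    Dense(fr_vocab, activation='softmax'),
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.summary()
```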
Now we finally train our model, achieving an accuracy of 99.32%. We predict the output for the test set and visualise five of the predictions; most of them are translated accurately.
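A training-and-inspection sketch under assumed hyperparameters (test fraction, batch size, and epoch count are not given in the write-up) might look like this:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Split into train and test sets; the 80/20 ratio is an assumption.
X_train, X_test, y_train, y_test = train_test_split(
    eng_pad, fr_pad, test_size=0.2, random_state=42)

# Keras expects the sparse targets with a trailing feature axis.
model.fit(X_train, y_train[..., np.newaxis],
          validation_split=0.1, batch_size=512, epochs=20)

# Translate a few test sentences and compare with the references.
preds = model.predict(X_test[:5]).argmax(axis=-1)
for src, pred, ref in zip(X_test[:5], preds, y_test[:5]):
    print('EN        :', sequence_to_text(src, eng_tokenizer))
    print('FR (pred) :', sequence_to_text(pred, fr_tokenizer))
    print('FR (ref)  :', sequence_to_text(ref, fr_tokenizer))
```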
We save the model along with its weights so that it can be loaded easily later on. The code to load the model and make predictions with it is also provided.
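Saving and reloading could be done as below; the file name english_to_french.h5 is an assumption for illustration.

```python
# Save the full model (architecture + weights) in one file.
model.save('english_to_french.h5')  # assumed file name

# Later, reload it and translate without retraining.
from tensorflow.keras.models import load_model
reloaded = load_model('english_to_french.h5')
pred = reloaded.predict(X_test[:1]).argmax(axis=-1)[0]
print(sequence_to_text(pred, fr_tokenizer))
```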