We train a Multinomial Naive Bayes model and a logistic regression model to build a toxic comment classifier. The project is written in Python using the scikit-learn (sklearn) library.
About the Dataset:
Toxic comment classification- The dataset has 312,735 comments. Of these, the training set has 159,571 comments while the test set has 153,164 comments. Each comment is labelled for 6 toxic behaviours, and a comment can belong to more than one class. The classes are “Toxic, Severe toxic, Obscene, Threat, Insult, Identity hate”.
The dataset is available at this website, and the code can be executed after the files are uploaded to Google Drive.
We use TfidfVectorizer (term frequency-inverse document frequency) to tokenize the comments. It learns the vocabulary and the inverse document frequency weightings from the training set, then converts each comment into a numeric feature vector on which the models can be fitted.
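As a minimal sketch of the vectorization step, the snippet below fits a TfidfVectorizer with default settings on three invented comments. The toy comments and variable names are assumptions for illustration only; the project itself fits the vectorizer on the real training set, possibly with other settings.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy comments invented for illustration (not taken from the dataset).
comments = [
    "you are a wonderful person",
    "you are an idiot",
    "have a wonderful day",
]

vectorizer = TfidfVectorizer()
# fit_transform learns the vocabulary and IDF weights,
# then returns one sparse TF-IDF row per comment.
tfidf = vectorizer.fit_transform(comments)

# Map each learned term to its IDF weight (rarer terms get larger weights).
idf = {term: vectorizer.idf_[index]
       for term, index in vectorizer.vocabulary_.items()}

print(tfidf.shape)
print(sorted(idf.items(), key=lambda item: item[1]))

# New text is encoded with the vocabulary learned above;
# words that were never seen during fitting are ignored.
unseen = vectorizer.transform(["I will kill you"])
print(unseen.nnz)
```

With the default tokenizer, single-character tokens such as "a" and "I" are dropped, which is why the new comment ends up with only one known term ("you").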
We consider two models to solve this problem:
1) Multinomial Naive Bayes model- This model is based on word frequencies. The frequencies observed for each class in the training set are normalized into probabilities, which are then used to compute predictions.
This model works well where the data can be converted into counts. If a word never occurs with a class in the training set, its unsmoothed probability estimate for that class is zero, because the estimate is directly linked to the number of occurrences. For this reason the counts are usually smoothed; sklearn's MultinomialNB applies additive smoothing by default (alpha=1.0).
2) Logistic regression model- This model estimates the probability of an output class for a given input. It is used for classification problems, to find which class a sample belongs to. The technique works well when the classes are linearly separable or close to it.
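The zero-count issue described for the Naive Bayes model can be illustrated with a small sketch. It uses CountVectorizer and four invented comments with a single binary "toxic" label; these are assumptions made to keep the counts easy to follow, not the project's actual pipeline, which uses TF-IDF features and six labels.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Toy data invented for illustration: 1 = toxic, 0 = not toxic.
train_comments = [
    "you are an idiot",
    "stupid idiot",
    "have a nice day",
    "thank you for the help",
]
train_labels = [1, 1, 0, 0]

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(train_comments)

# alpha=1.0 is additive (Laplace) smoothing, which is also sklearn's default.
nb = MultinomialNB(alpha=1.0)
nb.fit(X, train_labels)

# "nice" never occurs in a toxic training comment, so its raw count for the
# toxic class is 0. Smoothing still gives it a small non-zero probability.
toxic_row = list(nb.classes_).index(1)
nice_idx = vectorizer.vocabulary_["nice"]
nice_log_prob = nb.feature_log_prob_[toxic_row, nice_idx]
print(np.exp(nice_log_prob))

# Probability that a new comment is toxic.
new_comment = vectorizer.transform(["you stupid idiot"])
toxic_prob = nb.predict_proba(new_comment)[0, toxic_row]
print(toxic_prob)
```

Without smoothing, a single word with a zero count for a class would force the whole product of word probabilities for that class to zero, regardless of the other words in the comment.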
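ROC AUC- The area under the receiver operating characteristic (ROC) curve is an evaluation metric for classification models. The ROC curve plots the true positive rate against the false positive rate at different classification thresholds. The AUC measures separability, that is, how well the model ranks positive samples above negative ones: 1.0 means perfect separation, while 0.5 is no better than random guessing.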
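To make the metric concrete, the sketch below computes ROC AUC with sklearn's roc_auc_score on four made-up predictions and checks it against the pairwise-ranking interpretation. The labels and scores are invented for illustration and are not results from this project.

```python
from sklearn.metrics import roc_auc_score

# Invented example: true labels and predicted probabilities for one class.
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]

auc = roc_auc_score(y_true, y_score)

# Equivalent view: the fraction of (positive, negative) pairs in which
# the positive sample receives the higher score.
positives = [s for s, y in zip(y_score, y_true) if y == 1]
negatives = [s for s, y in zip(y_score, y_true) if y == 0]
pairs = [(p, n) for p in positives for n in negatives]
pairwise_auc = sum(p > n for p, n in pairs) / len(pairs)

print(auc, pairwise_auc)  # both are 0.75
```

Three of the four positive/negative pairs are ranked correctly here (only 0.35 versus 0.4 is not), which gives 0.75.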
1) Multinomial Naive Bayes model:
2) Logistic regression:
We use the logistic regression model for the final classifier because it scores better on the performance metric.
For the example comment “I will kill you”, the predicted probability for each class is:
1) Toxic- 99.71%
2) Severe toxic- 92.49%
3) Obscene- 70.60%
4) Threat- 99.99%
5) Insult- 71.88%
6) Identity hate- 34.15%
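Per-class percentages like the ones above can be produced by fitting one binary logistic regression per label and reading the positive-class probability from predict_proba. The sketch below assumes that one-model-per-label setup and uses eight invented comments with only two of the six labels, so its numbers will not match the figures listed above. The score helper is a hypothetical name introduced for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Toy data invented for illustration; the real dataset has six label columns.
comments = [
    "i will hurt you",
    "you are an idiot",
    "i will find you and hurt you",
    "thanks for the helpful edit",
    "have a nice day",
    "you are stupid",
    "this article is well written",
    "great work on the article",
]
labels = {
    "toxic":  [1, 1, 1, 0, 0, 1, 0, 0],
    "threat": [1, 0, 1, 0, 0, 0, 0, 0],
}

vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(comments)

# Assumption: one independent binary classifier per label.
models = {}
for name, y in labels.items():
    clf = LogisticRegression()
    clf.fit(X, y)
    models[name] = clf


def score(text):
    """Return the predicted probability of each label for one comment."""
    features = vectorizer.transform([text])
    # Column 1 of predict_proba is the probability of the positive class.
    return {name: clf.predict_proba(features)[0, 1]
            for name, clf in models.items()}


for name, probability in score("I will hurt you").items():
    print(f"{name}: {probability:.2%}")
```

Because each label has its own classifier, the probabilities are independent and do not need to sum to 100%, which is consistent with the example output listed above.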
1) Can be used by authorities to monitor toxic comments on the internet.
2) Can be used by authorities to monitor cyber bullying.
3) Can be used as automated detection to prevent mass identity hate on social media.
4) Can be adapted to work as a spam filter.