Basics of machine learning: train and predict with the Iris dataset

This project uses the Naive Bayes algorithm to train a model on the Iris dataset and predict the output, that is, the type of iris plant.

Hello learners!

In this project we will cover the very basics of machine learning: training a model on a training dataset, then predicting an output for some new input data.

We will use the Iris dataset, one of the most popular datasets for learning how to train a model and make predictions.

The Iris dataset records four measurements for each plant: sepal length, sepal width, petal length and petal width. Each plant belongs to one of three species: setosa, versicolor or virginica. The four measurements are the inputs and the species is the output. For example:

A plant with sepal length 5.1, sepal width 3.5, petal length 1.4 and petal width 0.2 belongs to the setosa class.

The dataset contains many rows like this one, and we will train our model on them. We will use the dataset as a .csv file so that it is easy to read and train on.

Let's start by reading the .csv file:

import pandas as pd


df = pd.read_csv("iris2.csv")

This reads the Iris dataset from our .csv file into a pandas DataFrame.
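Before training, it helps to look at what was loaded. The sketch below is only an illustration: since iris2.csv is not reproduced here, it reads a few sample rows from an in-memory string instead, and the column names are assumptions that may not match the real file.

```python
import io
import pandas as pd

# Hypothetical stand-in for iris2.csv: a header row plus four sample rows.
# The column names are assumptions; the real file may use different ones.
csv_text = """sepal_length,sepal_width,petal_length,petal_width,species
5.1,3.5,1.4,0.2,setosa
4.9,3.0,1.4,0.2,setosa
7.0,3.2,4.7,1.4,versicolor
6.3,3.3,6.0,2.5,virginica
"""

# read_csv accepts a file path or any file-like object
sample_df = pd.read_csv(io.StringIO(csv_text))

print(sample_df.shape)   # (number of rows, number of columns)
print(sample_df.head())  # first few rows
print(sample_df.dtypes)  # column types: floats for measurements, object for species
```

By default read_csv treats the first line as column names. If your file has no header row, pass header=None so that the first plant is not lost.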

Now let us convert it into NumPy arrays so that our model can be trained on it.

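import numpy as np

# Convert the DataFrame into a 2-D NumPy array (one row per plant)
y = np.array(df)

# Inputs: drop column 4 (the output), keeping the four measurements
x_train = np.delete(y, [4], axis=1)

# Outputs: drop columns 0-3, then flatten to a 1-D array
y_train = np.delete(y, [0, 1, 2, 3], axis=1).flatten()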

For x_train we need the sepal length, sepal width, petal length and petal width, so we delete the last column, which contains the output (setosa, versicolor or virginica).

For y_train we need only the output, so we delete all four input columns.

.flatten() converts the resulting n-dimensional array into a 1-D array.

Now we have our training dataset ready as x_train and y_train.
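To see what np.delete and .flatten() do to the shapes, here is a small sketch on a toy three-row array standing in for np.array(df). The names data, features and labels are made up for this illustration, and the slicing variant is shown only as an alternative way to get the same split.

```python
import numpy as np

# Toy array standing in for np.array(df); dtype=object because it mixes
# numbers and strings, as a DataFrame with a text column does when converted.
data = np.array(
    [
        [5.1, 3.5, 1.4, 0.2, "setosa"],
        [7.0, 3.2, 4.7, 1.4, "versicolor"],
        [6.3, 3.3, 6.0, 2.5, "virginica"],
    ],
    dtype=object,
)

# Same calls as in the tutorial
features = np.delete(data, [4], axis=1)                    # shape (3, 4)
labels = np.delete(data, [0, 1, 2, 3], axis=1).flatten()   # shape (3,)

# Alternative: slice the columns directly instead of deleting the others
features_sliced = data[:, :4].astype(float)  # all rows, first four columns
labels_sliced = data[:, 4]                   # all rows, last column only

print(features.shape, labels.shape)
print(labels)
```

Both approaches give the same split; slicing also lets you convert the measurements to floats explicitly.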

Now we will use the Naive Bayes algorithm to train our model.

from sklearn.naive_bayes import GaussianNB

clf=GaussianNB()
clf.fit(x_train,y_train)

.fit() is the method that trains our model.

Now our model is trained and it is ready to make some predictions.
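To give a rough idea of what happens inside .fit() and .predict(), here is a minimal from-scratch sketch of Gaussian Naive Bayes. It is a simplification for illustration, not scikit-learn's actual implementation. The function names fit_gaussian_nb and predict_gaussian_nb and the toy measurements are made up for this example.

```python
import numpy as np

def fit_gaussian_nb(X, y):
    """Estimate a prior, per-feature mean and per-feature variance for each class."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    params = {}
    for label in np.unique(y):
        rows = X[y == label]
        params[label] = {
            "prior": len(rows) / len(X),
            "mean": rows.mean(axis=0),
            "var": rows.var(axis=0) + 1e-9,  # small constant avoids division by zero
        }
    return params

def predict_gaussian_nb(params, sample):
    """Return the class with the highest log prior plus log likelihood."""
    sample = np.asarray(sample, dtype=float)
    best_label, best_score = None, -np.inf
    for label, p in params.items():
        # Sum of per-feature Gaussian log densities ("naive" independence assumption)
        log_likelihood = -0.5 * np.sum(
            np.log(2 * np.pi * p["var"]) + (sample - p["mean"]) ** 2 / p["var"]
        )
        score = np.log(p["prior"]) + log_likelihood
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy measurements (made up, roughly in the range of real iris values)
X_toy = [
    [5.1, 3.5, 1.4, 0.2], [4.9, 3.0, 1.5, 0.1], [4.7, 3.2, 1.3, 0.3],
    [7.0, 3.2, 4.7, 1.4], [6.4, 3.1, 4.5, 1.5], [6.9, 3.3, 4.9, 1.3],
]
y_toy = ["setosa"] * 3 + ["versicolor"] * 3

toy_params = fit_gaussian_nb(X_toy, y_toy)
print(predict_gaussian_nb(toy_params, [5.0, 3.3, 1.4, 0.2]))
print(predict_gaussian_nb(toy_params, [6.5, 3.2, 4.6, 1.4]))
```

Training only stores a few statistics per class, and prediction compares how well a new sample fits each class's Gaussian.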

We will make it user friendly: when the user enters some values, the program prints the output predicted for them.

print("enter values to get the predicted values")
sepal_length = float(input("enter sepal length:"))
sepal_width = float(input("enter sepal width:"))
petal_length = float(input("enter petal length:"))
petal_width = float(input("enter petal width:"))

print(clf.predict([[sepal_length, sepal_width, petal_length, petal_width]]))

In this way the program prints the predicted output.
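Note that float(input(...)) raises a ValueError if the user types something that is not a number. One possible way to handle this is sketched below. The helper read_positive_float and its read parameter are hypothetical names introduced for this illustration, and rejecting values of zero or less is an assumption about what counts as a valid measurement.

```python
def read_positive_float(prompt, read=input):
    """Keep asking until the user enters a number greater than zero."""
    while True:
        text = read(prompt)
        try:
            value = float(text)
        except ValueError:
            print("Please enter a number, for example 5.1")
            continue
        if value > 0:
            return value
        print("Measurements must be greater than zero")

# Usage with the trained model would look like this:
# sepal_length = read_positive_float("enter sepal length:")
```

The read parameter defaults to the built-in input; passing a different function makes the helper easy to try out without typing at the keyboard.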

Here is the complete code:

import numpy as np
import pandas as pd
from sklearn.naive_bayes import GaussianNB

df = pd.read_csv("iris2.csv")
y = np.array(df)
x_train = np.delete(y, [4], axis=1)
y_train = np.delete(y, [0, 1, 2, 3], axis=1).flatten()

clf = GaussianNB()
clf.fit(x_train, y_train)

print("enter values to get the predicted values")
sepal_length = float(input("enter sepal length:"))
sepal_width = float(input("enter sepal width:"))
petal_length = float(input("enter petal length:"))
petal_width = float(input("enter petal width:"))

print(clf.predict([[sepal_length, sepal_width, petal_length, petal_width]]))

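We can also hold back 10-20 percent of the data, predict the output for those rows, and compare the predictions with the true values to find the accuracy of our model. The held-back rows should not be used for training, otherwise the accuracy will look better than it really is. It is also safer to pick them at random rather than from the top of the file, since the rows may be ordered by species.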
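A minimal sketch of such an accuracy check follows. It assumes scikit-learn's built-in copy of the Iris data (load_iris) as a stand-in for iris2.csv, so the labels here are the integers 0, 1 and 2 rather than species names. The 20 percent test size and random_state=0 are arbitrary choices for the example.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Built-in Iris data, used here instead of iris2.csv (assumption)
X, labels = load_iris(return_X_y=True)

# Hold back 20% of the rows at random; stratify keeps the species balanced
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.2, random_state=0, stratify=labels
)

# Train only on the training rows
model = GaussianNB()
model.fit(X_train, y_train)

# Compare predictions on the held-back rows with their true labels
accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"accuracy on held-out data: {accuracy:.2f}")
```

The exact accuracy depends on which rows end up in the test set, so expect it to vary a little with a different random_state.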

This is the basic way in which a model is trained and predictions are made. You can use whichever algorithm you think works best for your dataset.

Thank you. I hope you now clearly understand the basics of training a model and predicting an output in machine learning.
