K Nearest Neighbor from Scratch using Python

By Akhilesh Ketkar

This project walks through the inner workings of the K Nearest Neighbor (KNN) algorithm by implementing it step by step in Python.

Coding the K Nearest Neighbor algorithm from scratch using NumPy in Python


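First, we need to install NumPy by running "pip install numpy" at the command prompt. The collections module is part of the Python standard library, so it does not need to be installed. The lines below import NumPy under the alias np and the Counter class from collections, which are the names the rest of the code uses.

import numpy as np
from collections import Counter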

KNN is a lazy learner: it builds no model during training and simply stores the training data until prediction time. The constructor takes the number of nearest neighbors, k, and the fit method accepts the training data and labels.

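# The class name KNN is an assumption; the methods below need an enclosing class
class KNN:
    def __init__(self, k=3):
        # k is the number of nearest neighbors that vote on each prediction
        self.k = k

    def fit(self, X, y):
        # Lazy learning: store the training data and labels, compute nothing yet
        self.X_train = X
        self.Y_train = y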

We need a helper function to calculate the distance between two points. The smaller the distance, the more similar the points are.

    def _euclidean_distance(self, x1, x2):
        # Square root of the sum of squared coordinate differences
        return np.sqrt(np.sum((x1 - x2) ** 2))
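
As a quick sanity check of the distance formula, here is a minimal standalone sketch. The free function euclidean_distance and the sample points are illustrative assumptions, not part of the class above; the points form a 3-4-5 right triangle, so the result is easy to verify by hand.

```python
import numpy as np


def euclidean_distance(x1, x2):
    """Straight-line distance between two points of equal dimension."""
    # Convert to float arrays so plain Python lists are accepted too
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    return np.sqrt(np.sum((x1 - x2) ** 2))


# sqrt(3**2 + 4**2) = sqrt(25) = 5
d = euclidean_distance([0, 0], [3, 4])
print(d)  # 5.0
```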

For every point we want to classify, we'll perform the following steps:

          1) calculate the Euclidean distance between the point with the unknown class and every training point with a known class

          2) select the k neighbors with the smallest Euclidean distance

          3) choose the most common class among those neighbors.
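
The three steps above can be sketched for a single query point as follows. This is only an illustration under stated assumptions: the toy data (X_train, y_train, x_new) is made up, and the distances are computed with NumPy broadcasting rather than an explicit loop.

```python
from collections import Counter

import numpy as np

# Hypothetical toy data: three points of class "a" near (1, 1), two of class "b" near (8, 8)
X_train = np.array([[1.0, 1.0], [1.0, 2.0], [2.0, 1.0], [8.0, 8.0], [9.0, 8.0]])
y_train = np.array(["a", "a", "a", "b", "b"])
x_new = np.array([1.5, 1.5])  # the point with the unknown class
k = 3

# Step 1: Euclidean distance from x_new to every training point
distances = np.sqrt(np.sum((X_train - x_new) ** 2, axis=1))

# Step 2: indices of the k training points with the smallest distance
nearest_idx = np.argsort(distances)[:k]

# Step 3: majority vote among the labels of those k neighbors
nearest_labels = y_train[nearest_idx].tolist()
predicted = Counter(nearest_labels).most_common(1)[0][0]

print(predicted)  # a
```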

    def predict(self, X_predict):
        y_predict = []
        # We need to predict every value individually
        for x in X_predict:
            # Distance from x to every training point
            distances = [self._euclidean_distance(x, x_train) for x_train in self.X_train]
            # Indices of the k closest training points
            nearest_neighbor_indx = np.argsort(distances)[:self.k]
            nearest_neighbor_labels = [self.Y_train[label_indx] for label_indx in nearest_neighbor_indx]
            # Majority vote: the most common label among the k neighbors
            nearest_neighbor = Counter(nearest_neighbor_labels).most_common(1)[0][0]
            y_predict.append(nearest_neighbor)
        return np.array(y_predict)
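
To see the pieces working together, here is a minimal end-to-end sketch. It assumes the methods are collected in a class named KNN (the name is an assumption) and uses a small made-up dataset with two well-separated clusters; it is meant as an illustration, not a tuned implementation.

```python
from collections import Counter

import numpy as np


class KNN:
    def __init__(self, k=3):
        self.k = k  # number of neighbors that vote

    def fit(self, X, y):
        # Lazy learner: just store the training data
        self.X_train = np.asarray(X, dtype=float)
        self.Y_train = np.asarray(y)

    def _euclidean_distance(self, x1, x2):
        return np.sqrt(np.sum((x1 - x2) ** 2))

    def predict(self, X_predict):
        y_predict = []
        for x in np.asarray(X_predict, dtype=float):
            distances = [self._euclidean_distance(x, x_train) for x_train in self.X_train]
            nearest_idx = np.argsort(distances)[:self.k]
            labels = [self.Y_train[i] for i in nearest_idx]
            # Record the majority label for this point
            y_predict.append(Counter(labels).most_common(1)[0][0])
        return np.array(y_predict)


# Hypothetical toy data: class 0 clusters near (1, 1), class 1 near (8, 8)
X = [[1, 1], [1, 2], [2, 1], [8, 8], [9, 8], [8, 9]]
y = [0, 0, 0, 1, 1, 1]

model = KNN(k=3)
model.fit(X, y)
preds = model.predict([[1.5, 1.5], [8.5, 8.5]])
print(preds)  # [0 1]
```

An odd k is a common choice for two-class problems because it avoids tied votes. With ties, Counter.most_common returns the label that was counted first, which here means the label of the closest neighbor.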


