In this tutorial we will learn an unsupervised learning algorithm, K-means clustering, using Python. The algorithm categorises items into k groups by similarity, and we measure similarity using Euclidean distance.
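To make the similarity measure concrete, here is a minimal sketch of Euclidean distance between two points. The helper name `euclidean_distance` and the sample points are illustrative assumptions, not part of any library.

```python
import numpy as np

def euclidean_distance(a, b):
    """Straight-line distance between two points given as coordinate sequences."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.sqrt(np.sum((a - b) ** 2)))

# Two illustrative 2-D points (a 3-4-5 right triangle).
print(euclidean_distance([0, 0], [3, 4]))  # 5.0
```

The smaller the distance, the more similar two items are considered to be.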
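The algorithm works as follows. It starts with k initial means, assigns each item to its nearest mean, recomputes each mean as the average of the items assigned to it, and repeats the last two steps until the means stop changing.
There are several ways to initialize the means. One method is to place them at random items in the data set; another is to place them at random values within the boundaries of the data set.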
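The two initialization options and the assign/update loop can be sketched in plain NumPy as below. This is a minimal illustration under stated assumptions rather than a production implementation: the function names (`init_from_items`, `init_in_bounds`, `kmeans`), the `max_iter` and `seed` parameters, and the six sample points are all hypothetical.

```python
import numpy as np

def init_from_items(X, k, rng):
    """Initialize the means at k distinct items drawn at random from the data."""
    idx = rng.choice(len(X), size=k, replace=False)
    return X[idx].copy()

def init_in_bounds(X, k, rng):
    """Initialize the means at random values within the per-feature bounds of the data."""
    low, high = X.min(axis=0), X.max(axis=0)
    return rng.uniform(low, high, size=(k, X.shape[1]))

def kmeans(X, k, init=init_from_items, max_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    means = init(X, k, rng)
    labels = np.zeros(len(X), dtype=int)
    for _ in range(max_iter):
        # Assignment step: Euclidean distance from every item to every mean.
        dists = np.linalg.norm(X[:, None, :] - means[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: move each mean to the average of its assigned items.
        new_means = means.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # an empty cluster keeps its previous mean
                new_means[j] = members.mean(axis=0)
        if np.allclose(new_means, means):  # means stopped moving
            break
        means = new_means
    return means, labels

# Illustrative data: two well-separated groups of three points each.
X = np.array([[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
              [8.0, 8.0], [8.2, 7.9], [7.8, 8.1]])
means, labels = kmeans(X, k=2)
print(labels)
print(means)
```

Passing `init=init_in_bounds` switches to the second option. With that option a mean can start far from every item and end up with no items assigned, which is why the sketch leaves an empty cluster's mean unchanged.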
Import all the required libraries into the Python notebook.
Here, we take random data and run the clustering algorithm on it.
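A minimal sketch of creating such random data might look like this. The sample size (100), the value range, the seed, and the column names `x` and `y` are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

# Fixed seed so the example is reproducible (assumed value).
rng = np.random.default_rng(42)

# 100 random 2-D points, uniform on [0, 1); size and range are assumptions.
df = pd.DataFrame(rng.random((100, 2)), columns=["x", "y"])
print(df.head())
```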
The data points in the data set are then visualised with a scatter plot. The blue dots represent the data.
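One way to draw this scatter plot with matplotlib is sketched below. The data is regenerated so the snippet runs on its own; the seed, sample size, and axis labels are assumptions.

```python
import numpy as np
import matplotlib.pyplot as plt

# Illustrative random data (assumed size and seed).
rng = np.random.default_rng(42)
points = rng.random((100, 2))

fig, ax = plt.subplots()
ax.scatter(points[:, 0], points[:, 1], c="blue", label="data")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.legend()
# In a notebook the figure renders when the cell finishes;
# in a plain script, call plt.show() to display it.
```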
We choose 2 as the number of clusters, apply K-means to the data set, and plot the centroids of the clusters.
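This step can be sketched with scikit-learn's `KMeans` class, which is an assumption here since the tutorial does not name the clustering library. The data, seed, and marker size are likewise illustrative.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

# Illustrative random data (assumed size and seed).
rng = np.random.default_rng(42)
points = rng.random((100, 2))

# Fit K-means with 2 clusters; random_state fixes the initialization.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(points)
centroids = kmeans.cluster_centers_

fig, ax = plt.subplots()
ax.scatter(points[:, 0], points[:, 1], c="blue", label="data")
ax.scatter(centroids[:, 0], centroids[:, 1], c="red", s=100, label="centroids")
ax.legend()
# In a notebook the figure renders when the cell finishes;
# in a plain script, call plt.show() to display it.
```

`labels` holds the cluster index (0 or 1) assigned to each point, and `centroids` holds one row of coordinates per cluster.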
After running the code, we see the centroids of the clusters plotted in the graph. The red dots represent the centroids. In this way, you can run the K-means algorithm on any given dataset using simple Python libraries like pandas and matplotlib.