Coders Packet

Predicting the Optimum Number of Clusters in the K-Means Algorithm in Python

By Paras Rawat

Predicting the optimum number of clusters for the K-Means algorithm with the help of the Elbow method in Python.

The K-Means algorithm is used for unsupervised learning. It attempts to group similar data points into clusters. A common way of choosing the number of clusters to use is the Elbow method.



1) i) Import pandas, NumPy, and matplotlib

   ii) Import datasets from sklearn.


2) Read the CSV dataset into a DataFrame named iris using pandas' read_csv.
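A minimal sketch of this step. The article's CSV file is not included here, so the sketch assumes a stand-in: it first writes scikit-learn's bundled Iris data to a temporary file (the name iris.csv is hypothetical) and then reads it back with pandas.

```python
import tempfile
from pathlib import Path

import pandas as pd
from sklearn import datasets

# Assumption: the article's CSV is not available, so scikit-learn's bundled
# Iris data is written to a stand-in file first.
csv_path = Path(tempfile.gettempdir()) / "iris.csv"  # hypothetical file name
datasets.load_iris(as_frame=True).frame.to_csv(csv_path, index=False)

# Read the CSV dataset into a DataFrame named iris.
iris = pd.read_csv(csv_path)
print(iris.head())
```

With your own file, only the pd.read_csv line is needed, pointed at the path of your CSV.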


3) Analyze your data and plot some graphs to understand the dataset better.

4) Store the feature values of your dataset as an array in the variable X using '.values'.
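Steps 3 and 4 might look roughly like this. The sketch assumes the Iris data is loaded straight from scikit-learn, so the column names (including the label column "target") are those of that bundled dataset and may differ in your CSV. The output file name is hypothetical.

```python
import tempfile
from pathlib import Path

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to open a window
import matplotlib.pyplot as plt
from sklearn import datasets

# Assumption: Iris is loaded from scikit-learn instead of the article's CSV.
iris = datasets.load_iris(as_frame=True).frame

# Step 3: summary statistics and a scatter plot of two features.
print(iris.describe())
fig, ax = plt.subplots()
ax.scatter(iris["sepal length (cm)"], iris["petal length (cm)"])
ax.set_xlabel("sepal length (cm)")
ax.set_ylabel("petal length (cm)")
fig.savefig(Path(tempfile.gettempdir()) / "iris_scatter.png")  # hypothetical name

# Step 4: drop the label column (clustering is unsupervised) and convert the
# remaining four feature columns to a NumPy array.
X = iris.drop(columns="target").values
print(X.shape)
```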

5) Import KMeans from sklearn.cluster and create an empty list wcss.

6) Create a for loop over i in range(1, 11), so that i takes the values 1 to 10.

7) Inside the loop, create a model kmeans with the parameter n_clusters set to i, and fit the model to X.

8) Now append kmeans.inertia_ to the list wcss. Inertia is the sum of squared distances from each sample to its closest cluster center, added up over all clusters.

9) Then plot the values in the list wcss against the number of clusters.
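Steps 5 to 9 can be sketched as follows. The data source (scikit-learn's bundled Iris), the n_init and random_state settings, and the output file name are assumptions made so the example runs on its own and gives repeatable results; they are not taken from the original article.

```python
import tempfile
from pathlib import Path

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to open a window
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.cluster import KMeans

# Assumption: X holds the four Iris feature columns from scikit-learn.
X = datasets.load_iris(as_frame=True).data.values

# Fit one model per candidate cluster count and record its inertia.
wcss = []
for i in range(1, 11):
    kmeans = KMeans(n_clusters=i, n_init=10, random_state=42)
    kmeans.fit(X)
    wcss.append(kmeans.inertia_)

# Plot WCSS against the number of clusters; the bend is the "elbow".
fig, ax = plt.subplots()
ax.plot(range(1, 11), wcss, marker="o")
ax.set_xlabel("Number of clusters")
ax.set_ylabel("WCSS (inertia)")
ax.set_title("Elbow method")
fig.savefig(Path(tempfile.gettempdir()) / "elbow.png")  # hypothetical file name
```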

10) The optimum number of clusters is where the elbow occurs. This is the point after which the within-cluster sum of squares (wcss) no longer decreases significantly as more clusters are added.

11) Now train your model with the optimum number of clusters and use it to make predictions.
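A minimal sketch of this last step. It assumes the elbow for the Iris data is read off at 3 clusters; that value, the data source, and the n_init and random_state settings are illustrative assumptions, so substitute the number you see on your own plot.

```python
from sklearn import datasets
from sklearn.cluster import KMeans

# Assumption: X holds the four Iris feature columns from scikit-learn.
X = datasets.load_iris(as_frame=True).data.values

# Assumption: the elbow plot suggested 3 clusters for this dataset.
optimum_k = 3
kmeans = KMeans(n_clusters=optimum_k, n_init=10, random_state=42)

# Fit the model and get the cluster index assigned to every sample.
labels = kmeans.fit_predict(X)
print(labels[:10])
print(kmeans.cluster_centers_)

# Predict the cluster of a new measurement (values in cm, same column order).
new_sample = [[5.1, 3.5, 1.4, 0.2]]
print(kmeans.predict(new_sample))
```

The cluster indices are arbitrary labels: they identify groups, not the species names in the dataset.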