By Aryan Yadav
This project predicts the annual rainfall of all states (subdivisions) of India from monthly rainfall data covering the years 1901-2015, using multiple linear regression in Python.
You need to have NumPy, pandas, scikit-learn and Matplotlib installed on your system.
The dataset is taken from Kaggle; you can find it here.
Step 1 :
Import all the required libraries mentioned above for this project.
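A minimal sketch of the imports this walkthrough relies on. The exact scikit-learn names pulled in here (LinearRegression, train_test_split and the metric functions) are an assumption based on the steps described below, not a copy of the original source code.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Print the installed versions to confirm the environment is set up.
print("numpy", np.__version__)
print("pandas", pd.__version__)
```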
Step 2 :
Read the .csv file and fill all the NULL (missing) values with the mean of their column. The shape attribute gives the number of rows and columns of the dataframe. Group the rows by subdivision/state and use the size() function to verify the number of year entries present for each subdivision/state.
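The reading and cleaning step could look roughly like the sketch below. The column names (SUBDIVISION, YEAR, JAN, FEB, ANNUAL) are assumed to match the Kaggle file, the helper name load_and_clean is hypothetical, and the inline sample with made-up values stands in for the real CSV so the snippet runs on its own.

```python
import io

import pandas as pd


def load_and_clean(csv_source):
    """Read the rainfall CSV and replace missing values with column means."""
    data = pd.read_csv(csv_source)
    # Only numeric columns have a meaningful mean, so fill those and
    # leave text columns such as SUBDIVISION untouched.
    numeric_cols = data.select_dtypes(include="number").columns
    data[numeric_cols] = data[numeric_cols].fillna(data[numeric_cols].mean())
    return data


# Small inline sample standing in for the Kaggle file (values are made up,
# column names are an assumption about the real dataset).
sample_csv = io.StringIO(
    "SUBDIVISION,YEAR,JAN,FEB,ANNUAL\n"
    "KERALA,1901,28.7,44.7,3248.6\n"
    "KERALA,1902,6.7,,3326.6\n"
    "PUNJAB,1901,30.0,10.0,500.0\n"
)
data = load_and_clean(sample_csv)

print(data.shape)                          # (rows, columns)
print(data.groupby("SUBDIVISION").size())  # year entries per subdivision
```

With the real dataset, pass the path of the downloaded .csv file to load_and_clean instead of the inline sample.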
Step 3 :
Plot the data in several ways -
1) Scatter plot of the annual and January attributes.
2) Box plot of the annual rainfall data for the years 1901-2015.
3) Histogram of the annual rainfall of all states.
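The three plots listed above can be sketched as follows. The column names JAN and ANNUAL are assumed from the dataset, plot_overview is a hypothetical helper, and the random demo dataframe only exists to make the snippet runnable without the CSV.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def plot_overview(data):
    """Draw a scatter plot, a box plot and a histogram side by side."""
    fig, (ax_scatter, ax_box, ax_hist) = plt.subplots(1, 3, figsize=(15, 4))

    # 1) January rainfall against annual rainfall.
    ax_scatter.scatter(data["JAN"], data["ANNUAL"], s=10)
    ax_scatter.set_xlabel("January rainfall")
    ax_scatter.set_ylabel("Annual rainfall")

    # 2) Spread of the annual rainfall values.
    ax_box.boxplot(data["ANNUAL"].to_numpy())
    ax_box.set_ylabel("Annual rainfall")

    # 3) Distribution of the annual rainfall values.
    ax_hist.hist(data["ANNUAL"].to_numpy(), bins=20)
    ax_hist.set_xlabel("Annual rainfall")
    ax_hist.set_ylabel("Count")

    fig.tight_layout()
    return fig


# Random stand-in data (not real rainfall measurements).
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "JAN": rng.uniform(0, 100, 50),
    "ANNUAL": rng.uniform(300, 3500, 50),
})
fig = plot_overview(demo)
```

Call plt.show() afterwards to display the figure, or fig.savefig(...) to write it to a file.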
Step 4 :
Create a new dataframe that keeps only the twelve monthly columns and drops the rest. From it we can find the maximum and minimum rainfall for each of the 12 months.
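One possible reading of this step is sketched below: keep the twelve monthly columns and take the column-wise maximum and minimum. The month column names and the helper monthly_extremes are assumptions, and the two-row demo dataframe uses made-up numbers.

```python
import pandas as pd

# Assumed names of the monthly columns in the dataset.
MONTHS = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]


def monthly_extremes(data):
    """Return the monthly-only dataframe plus per-month max and min."""
    monthly = data[MONTHS]  # every non-monthly column is dropped
    return monthly, monthly.max(), monthly.min()


# Made-up demo rows: month i holds i in the first row and 2*i in the second.
row_a = [float(i + 1) for i in range(12)]
row_b = [2.0 * (i + 1) for i in range(12)]
demo = pd.DataFrame([row_a, row_b], columns=MONTHS)
demo["SUBDIVISION"] = ["A", "B"]
demo["ANNUAL"] = demo[MONTHS].sum(axis=1)

monthly, max_by_month, min_by_month = monthly_extremes(demo)
print(max_by_month)
print(min_by_month)
```

If a single overall maximum or minimum is wanted instead, monthly.max().max() and monthly.min().min() reduce the per-month values further.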
Step 5 :
Now we split the data into training and test sets using train_test_split, train the model on the training set, and evaluate it on the test set using performance measures such as mean squared error, root mean squared error and r2_score. We also plot a scatter plot of expected vs predicted values.
The model fits our data well: the mean squared error and root mean squared error are relatively low, and the points in the scatter plot lie close to a straight line.
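A minimal sketch of the training and evaluation step, assuming the twelve monthly columns are the features and ANNUAL is the target. The helper fit_and_evaluate, the 80/20 split and the random_state value are illustrative choices, and the synthetic data (annual = sum of months plus noise) is only there so the snippet runs without the CSV.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Assumed names of the monthly columns in the dataset.
MONTHS = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]


def fit_and_evaluate(data, test_size=0.2, random_state=42):
    """Fit a linear regression on monthly rainfall and score it."""
    X = data[MONTHS]
    y = data["ANNUAL"]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=random_state
    )
    model = LinearRegression().fit(X_train, y_train)
    y_pred = model.predict(X_test)

    mse = mean_squared_error(y_test, y_pred)
    metrics = {
        "mse": float(mse),
        "rmse": float(np.sqrt(mse)),  # RMSE is the square root of MSE
        "r2": float(r2_score(y_test, y_pred)),
    }
    return model, metrics, y_test, y_pred


# Synthetic stand-in data (not real rainfall measurements).
rng = np.random.default_rng(0)
demo = pd.DataFrame(rng.uniform(0, 300, size=(200, 12)), columns=MONTHS)
demo["ANNUAL"] = demo[MONTHS].sum(axis=1) + rng.normal(0, 5, size=200)

model, metrics, y_test, y_pred = fit_and_evaluate(demo)
print(metrics)

# Expected vs predicted values; a good fit clusters along the diagonal.
fig, ax = plt.subplots()
ax.scatter(y_test, y_pred, s=10)
ax.set_xlabel("Expected annual rainfall")
ax.set_ylabel("Predicted annual rainfall")
```

Fixing random_state makes the split reproducible, so the reported metrics do not change between runs.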
Submitted by Aryan Yadav (aryanyadav)