A simple rainfall prediction model using a dataset obtained from Kaggle. Follow the code for more information.
This rainfall prediction model is written in Python. The data comes from Australia and consists of rainfall records from the past 10 years.
The dataset contains rainfall patterns of many locations in Australia. Some of the columns present in the dataset are Date, Location, Maximum Temperature, Minimum Temperature, Wind Direction, Wind Speed, Humidity, Pressure, etc.
Using the input data, we have to predict whether or not it will rain tomorrow. Classification algorithms such as Logistic Regression, Naive Bayes, KNN, Decision Trees, and Random Forest were used.
1. We start with data acquisition: the input data is loaded and fed into the machine. This gives the machine its first chance to see what the various attributes are and how many of them are present.
2. The attributes are divided into categorical and numerical. Null values have to be removed or replaced (with either mean or median values).
3. The next step is to encode string data as numerical values. The input features are encoded using the One Hot Encoding method, whereas the output label is encoded using Label Encoder.
4. Highly correlated columns are removed, since they give us the same information more than once. The following step is to detect and eliminate outliers.
5. The next step is to divide the dataset into training and testing sets.
6. The final step involves training multiple classification algorithms on the data and evaluating the results.
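The steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it uses a small synthetic table instead of the Kaggle file, and the column names, the 0.9 correlation threshold, the IQR outlier rule, the 80/20 split, and accuracy as the metric are all assumptions chosen for the example.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import LabelEncoder
from sklearn.tree import DecisionTreeClassifier

# Step 1: synthetic stand-in for the Kaggle data (names and values are assumptions).
rng = np.random.default_rng(0)
n = 400
humidity = rng.uniform(20, 100, n)
df = pd.DataFrame({
    "MinTemp": rng.normal(12, 5, n),
    "Humidity": humidity,
    "Pressure": rng.normal(1015, 7, n),
    "WindDir": rng.choice(["N", "S", "E", "W"], n),
    "RainTomorrow": np.where(humidity + rng.normal(0, 10, n) > 70, "Yes", "No"),
})
df["MaxTemp"] = df["MinTemp"] + 10 + rng.normal(0, 0.5, n)  # almost redundant with MinTemp
df.loc[rng.choice(n, 20, replace=False), "Humidity"] = np.nan  # inject missing values

# Step 2: split columns by type, then fill nulls (median for numbers, mode for categories).
target = "RainTomorrow"
num_cols = df.select_dtypes(include="number").columns.tolist()
cat_cols = [c for c in df.columns if c not in num_cols and c != target]
df[num_cols] = df[num_cols].fillna(df[num_cols].median())
for c in cat_cols:
    df[c] = df[c].fillna(df[c].mode()[0])

# Step 3: one-hot encode the input features, label-encode the output.
X = pd.get_dummies(df.drop(columns=target), columns=cat_cols, dtype=float)
y = LabelEncoder().fit_transform(df[target].to_numpy())  # "No" -> 0, "Yes" -> 1

# Step 4a: drop one column from each highly correlated numeric pair.
corr = X[num_cols].corr().abs()
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
to_drop = [c for c in upper.columns if (upper[c] > 0.9).any()]
X = X.drop(columns=to_drop)

# Step 4b: remove rows that fall outside 1.5 * IQR on any remaining numeric column.
kept_num = [c for c in num_cols if c not in to_drop]
q1, q3 = X[kept_num].quantile(0.25), X[kept_num].quantile(0.75)
iqr = q3 - q1
inliers = (X[kept_num].ge(q1 - 1.5 * iqr, axis="columns")
           & X[kept_num].le(q3 + 1.5 * iqr, axis="columns")).all(axis=1)
X, y = X[inliers], y[inliers.to_numpy()]

# Step 5: hold out 20% of the rows for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# Step 6: train each classifier and record its accuracy on the test set.
models = {
    "Logistic Regression": LogisticRegression(max_iter=5000),
    "Naive Bayes": GaussianNB(),
    "KNN": KNeighborsClassifier(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(random_state=0),
}
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))
    print(f"{name}: {scores[name]:.3f}")
```

With the real dataset, the synthetic table would be replaced by loading the downloaded CSV, and the thresholds would need tuning. Accuracy alone can also be misleading if rainy days are much rarer than dry ones, so a confusion matrix or F1 score may be worth checking too.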
I hope this project helps you further your learning of Machine Learning. You can also use your own techniques and code to obtain similar or better results.
Submitted by G V Ganesh Maurya (mau23rya)
Download packets of source code on Coders Packet