Coders Packet

Temperature forecasting in Python using Linear Regression

By Sudipta Ghosh

This project describes the relationship between inside and outside temperature, y(inside) = m*x(outside) + c, in Python using Linear Regression.

In this tutorial, we will learn how to describe the relationship between inside and outside temperature in Python using Linear Regression. The steps are as follows:
step 1:
We will import the necessary libraries:-

import numpy as np 
import pandas as pd 

from datetime import datetime
from sklearn.linear_model import LinearRegression

import matplotlib
import matplotlib.pyplot as plt
# Jupyter/IPython magic; remove this line when running as a plain Python script
%matplotlib inline
import seaborn as sns

step 2:
We will load the temperature dataset:-

# import Dataset
temp = pd.read_csv('IOT-temp.csv', parse_dates=['noted_date'])
temp.head()

Here, head() displays the first five rows of the dataset, so we can inspect a sample of the data.
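
To see what parse_dates does without the real file, here is a minimal sketch that reads a few hypothetical rows from an in-memory CSV. The column names come from the code in this tutorial; the row values and the date format are assumptions made for illustration, and the real IOT-temp.csv may differ.

```python
import io
import pandas as pd

# Hypothetical sample rows (assumed values and date format, not taken from IOT-temp.csv)
csv_text = """id,room_id/id,noted_date,temp,out/in
a1,Room Admin,2018-12-08 09:30:00,29,In
a2,Room Admin,2018-12-08 09:30:00,41,Out
a3,Room Admin,2018-12-08 10:15:00,30,In
"""

# parse_dates converts the noted_date column from text to datetime values
sample = pd.read_csv(io.StringIO(csv_text), parse_dates=['noted_date'])
print(sample.dtypes)
print(sample.head())
```

If noted_date shows up as object instead of a datetime type, the dates were not parsed and the later hourly grouping would fail.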

step 3:
We will remove the columns we do not need and check the shape of the resulting dataset.

##REMOVING USELESS COLUMNS
# count how often each room id occurs before deciding to drop the column
temp['room_id/id'].value_counts()

# dropping columns
cols_drop = ['id', 'room_id/id']
temp = temp.drop(cols_drop, axis=1)

##TABLE SHAPE
print("the dataset has shape = {}".format(temp.shape))

step 4:
Here we will truncate the time of each measurement to the hour. Then we will compute the mean temperature for each measurement hour, separately for inside and outside. This lets us compare the inside and outside temperature at the same moment.
So, we will build a new feature from the timestamp.

temp['measure_hour'] = temp.noted_date.apply(lambda x:datetime.strftime(x,'%Y-%m-%d %H:00:00'))
temp_data = temp.groupby(['measure_hour','out/in']).temp.mean().reset_index()
temp_data = temp_data.pivot(index = 'measure_hour',columns = 'out/in', values = 'temp').reset_index().dropna()
temp_data.head()
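
The hourly aggregation can be sketched on a few made-up readings. The toy values below are assumptions for illustration only, and dt.floor('h') is used here as a vectorized alternative to the strftime call above, not as the method this tutorial relies on.

```python
import pandas as pd

# Toy readings (assumed values): two In and two Out readings at 09:xx, one In reading at 10:xx
toy = pd.DataFrame({
    'noted_date': pd.to_datetime([
        '2018-12-08 09:05', '2018-12-08 09:40',
        '2018-12-08 09:10', '2018-12-08 09:50',
        '2018-12-08 10:20',
    ]),
    'out/in': ['In', 'In', 'Out', 'Out', 'In'],
    'temp': [28, 30, 36, 38, 31],
})

# Truncate each timestamp down to the start of its hour
toy['measure_hour'] = toy['noted_date'].dt.floor('h')

# Mean temperature per hour and location, then one column per location
hourly = toy.groupby(['measure_hour', 'out/in'])['temp'].mean().reset_index()
wide = hourly.pivot(index='measure_hour', columns='out/in', values='temp').reset_index().dropna()
print(wide)
```

The 10:00 hour has no Out reading, so dropna() removes it. Only hours with both an inside and an outside mean survive, which is what the regression needs.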

step 5:

The next step is to visualize the distributions of inside and outside temperature using seaborn:

##VISUALIZATION
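fig, ax = plt.subplots(figsize = (6,4))
# distplot is deprecated since seaborn 0.11; histplot with kde=True is the closest replacement
g = sns.histplot(temp_data.In, kde = True, stat = 'density', label = 'In', ax = ax)
g = sns.histplot(temp_data.Out, kde = True, stat = 'density', label = 'Out', ax = ax)
plt.legend()
g.set_xlabel('Temperature')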

Now we will plot the relation between inside and outside temperature. We can guess that people turn on the air conditioner when the outside temperature rises above 35. If so, the inside temperature should stay in the interval [28, 35] whenever the outside temperature is higher than 35. Let's look at the scatter plot, paying attention to the points with an outside temperature above 35.

sns.scatterplot(x =temp_data.Out, y = temp_data.In)
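
The guess about the air conditioner can also be checked numerically rather than only by eye. The sketch below uses a small made-up table with the same In and Out columns as temp_data; the values are assumptions, so the printed numbers say nothing about the real dataset.

```python
import pandas as pd

# Toy hourly means (assumed values) with the same column names as temp_data
toy = pd.DataFrame({'Out': [30, 33, 36, 38, 41],
                    'In':  [29, 32, 31, 33, 30]})

# Keep only the hours where the outside temperature is above 35
hot = toy[toy['Out'] > 35]

# Summary statistics of the inside temperature during those hours
print(hot['In'].describe())

# Share of those hours whose inside temperature lies in [28, 35]
share = hot['In'].between(28, 35).mean()
print(share)
```

Applied to temp_data instead of the toy table, a share close to 1 would support the [28, 35] interval, while a low share would suggest the interval is too narrow.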

step 6:
Linear Regression model building:-

After removing the values with an outside temperature above 35, we can see a linear dependence, so we can use linear regression to predict the inside temperature from a known outside value.

sns.scatterplot(x = temp_data[temp_data.Out<=35].Out, y = temp_data[temp_data.Out<=35].In)

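We will remove some outliers in the code below:

# keep rows where both readings are present and the outside temperature is below 35
linear = temp_data[(pd.notna(temp_data.Out)) & (pd.notna(temp_data.In)) & (temp_data.Out<35)]

# drop rows that are warm outside but cool inside, and rows that are cold outside
outliers = ((linear.Out>32) & (linear.In<30)) | (linear.Out<25)
linear = linear.drop(index = linear[outliers].index)
sns.scatterplot(x = linear.Out, y = linear.In)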
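
The outlier condition mixes & and |, and in Python & binds more tightly than |, so it reads as "(warm outside and cool inside) or cold outside". Here is a small sketch on made-up values (an assumption, not the real data) that names each part of the condition separately.

```python
import pandas as pd

# Toy rows (assumed values) chosen so that each rule removes exactly one row
toy = pd.DataFrame({'Out': [24, 28, 33, 34],
                    'In':  [27, 29, 29, 33]})

# Rule 1: warm outside (above 32) but still cool inside (below 30)
cool_inside_on_warm_hour = (toy['Out'] > 32) & (toy['In'] < 30)

# Rule 2: outside temperature below 25
cold_outside = toy['Out'] < 25

# A row is an outlier if either rule applies
outliers = cool_inside_on_warm_hour | cold_outside
kept = toy[~outliers]
print(kept)
```

Naming the two rules makes it easier to adjust one threshold without accidentally changing how the conditions combine.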

Linear model for the dataset:-

model = LinearRegression()
model.fit(linear[['Out']],linear.In)

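# slope (k) and intercept (b) of the fitted line In = k * Out + b
k, b = model.coef_[0], model.intercept_
print(k, b)
sns.scatterplot(x = linear.Out, y = linear.In)
# draw the fitted line over the range of outside temperatures kept in the data
reg_line = np.linspace(25,35,100)
plt.plot(reg_line, reg_line*k + b)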
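
To get a feeling for how well such a fit recovers a linear relation, here is a minimal sketch on synthetic data. The slope 0.9, intercept 3.0 and noise level are assumptions chosen for illustration; they are not results from the real dataset. The score method of LinearRegression returns the R^2 of the fit.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Synthetic data (assumed parameters): In = 0.9 * Out + 3.0 plus Gaussian noise
rng = np.random.default_rng(0)
out = rng.uniform(25, 35, size=200)
inside = 0.9 * out + 3.0 + rng.normal(0, 0.5, size=200)
toy = pd.DataFrame({'Out': out, 'In': inside})

# Same fitting steps as in the tutorial, applied to the synthetic table
model = LinearRegression()
model.fit(toy[['Out']], toy['In'])
k, b = model.coef_[0], model.intercept_

# R^2: fraction of the variance of In explained by the fitted line
r2 = model.score(toy[['Out']], toy['In'])
print(f"slope={k:.2f}, intercept={b:.2f}, R^2={r2:.3f}")
```

Calling model.score on the real linear table in the same way would show how much of the inside temperature the outside temperature explains after the outliers are removed.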

After these steps, we have a simple model of the relationship between inside and outside temperature. I would describe the relationship like this:
Inside(Outside) = 0.89 ∗ Outside + 3, if Outside ∈ [25, 35]
Inside(Outside) ∈ [28, 35], if Outside ∈ (35, ∞)
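
The piecewise description can be written as a small function. This is only a sketch: predict_inside is a hypothetical helper name, the slope 0.89 and intercept 3 are the rounded values quoted above, and returning None below 25 is an assumption, since the model says nothing about that range.

```python
def predict_inside(outside, slope=0.89, intercept=3.0):
    """Return the inside temperature implied by the piecewise description.

    For 25 <= outside <= 35 a single value from the fitted line is returned.
    For outside > 35 only the interval (28, 35) is known, so a tuple is returned.
    Below 25 the model is undefined and None is returned (an assumption).
    """
    if 25 <= outside <= 35:
        return slope * outside + intercept
    if outside > 35:
        return (28.0, 35.0)
    return None


print(predict_inside(30))  # point estimate from the fitted line
print(predict_inside(40))  # only an interval is known above 35
print(predict_inside(20))  # outside the range the model covers
```

Returning an interval above 35 keeps the function honest about what the data supports: in that range the tutorial only bounds the inside temperature.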

 

 

 

