Time Series Analysis with TensorFlow

Hey there! Ready to dive into the exciting world of time series analysis? In this tutorial, we will learn how to use TensorFlow, Google’s open-source library for machine learning, to analyze and predict time series data.

We will work with a dataset of daily minimum temperatures in Melbourne and walk through each step. By the end of this tutorial, you’ll have built an LSTM model in TensorFlow that predicts the next day’s minimum temperature from the previous 30 days.

Building a Time Series Analysis Model with TensorFlow

Step 1: Setting the Stage

First things first, we need to set up our environment and load the necessary libraries. We’ll be using Pandas for handling data, NumPy for array operations, Matplotlib for plotting, and TensorFlow for building our model.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, LSTM

# Load the dataset
data = pd.read_csv("daily-min-temperatures.csv")
data.head()

Output:

  Date       Temp
0 1981-01-01 20.7
1 1981-01-02 17.9
2 1981-01-03 18.8
3 1981-01-04 14.6
4 1981-01-05 15.8

 

Step 2: Exploring the Data

Let’s take a look at our data. We’ll parse the Date column, set it as the index so the rows are ordered by time, print the first few rows, and then plot the whole series to see its seasonal pattern.

data['Date'] = pd.to_datetime(data['Date'])
data.set_index('Date', inplace=True)
print(data.head())

# Plotting the data
plt.figure(figsize=(10, 6))
plt.plot(data, label='Daily Minimum Temperatures')
plt.xlabel('Date')
plt.ylabel('Temperature')
plt.title('Daily Minimum Temperatures in Melbourne')
plt.legend()
plt.show()

Output:

Date      Temp
1981-01-01 20.7
1981-01-02 17.9
1981-01-03 18.8
1981-01-04 14.6
1981-01-05 15.8


Step 3: Preprocessing the data

Before modeling, we need to prepare the data. Time series forecasting uses sequences of past observations to predict future values, so we’ll slice the series into windows of 30 consecutive days, each labeled with the temperature on the day that follows.

def create_sequences(data, window_size):
    sequences = []
    labels = []
    for i in range(len(data) - window_size):
        sequences.append(data[i:i + window_size])
        labels.append(data[i + window_size])
    return np.array(sequences), np.array(labels)

window_size = 30
data_values = data['Temp'].values
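sequences, labels = create_sequences(data_values, window_size)

# Print the shapes shown in the output below
print(f"Sequences shape: {sequences.shape}")
print(f"Labels shape: {labels.shape}")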

Output:

Sequences shape: (3613, 30)
Labels shape: (3613,)
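
To see what the windowing does, here is a minimal sketch on a toy series. The six values are made up for illustration, and the helper is repeated only so the snippet runs on its own.

```python
import numpy as np

def create_sequences(data, window_size):
    """Same logic as the helper above, repeated so this runs standalone."""
    sequences, labels = [], []
    for i in range(len(data) - window_size):
        sequences.append(data[i:i + window_size])   # window of past values
        labels.append(data[i + window_size])        # the value right after it
    return np.array(sequences), np.array(labels)

# Toy series (made-up values) so the windows are easy to read
toy = np.array([10, 11, 12, 13, 14, 15])
toy_sequences, toy_labels = create_sequences(toy, window_size=3)

print(toy_sequences)   # rows: [10 11 12], [11 12 13], [12 13 14]
print(toy_labels)      # [13 14 15]
print(toy_sequences.shape, toy_labels.shape)   # (3, 3) (3,)
```

A series of length n with a window of w yields n - w pairs, because the last w values have no label after them.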

Step 4: Splitting the Data

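We need to split our data into training and testing sets. Because this is a time series, we split chronologically instead of shuffling: the first 80% of the sequences are used for training and the last 20% for testing. This lets us evaluate how well the model performs on data that comes after everything it was trained on.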

split_ratio = 0.8
split_index = int(len(sequences) * split_ratio)

x_train, x_test = sequences[:split_index], sequences[split_index:]
y_train, y_test = labels[:split_index], labels[split_index:]

print(f"Training set size: {x_train.shape[0]}")
print(f"Test set size: {x_test.shape[0]}")

Output:

Training set size: 2890
Test set size: 723
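
The slicing above can be checked on a toy list. This is only a sketch: the ten integers stand in for time-ordered samples and are not part of the dataset.

```python
# Toy stand-in for the sequences: ten samples, already in time order
samples = list(range(10))

split_ratio = 0.8
split_index = int(len(samples) * split_ratio)

train, test = samples[:split_index], samples[split_index:]

print(train)  # [0, 1, 2, 3, 4, 5, 6, 7]
print(test)   # [8, 9]

# Every training sample comes before every test sample in time
assert max(train) < min(test)

# Same arithmetic for the 3613 sequences from Step 3
real_train = int(3613 * split_ratio)
real_test = 3613 - real_train
print(real_train, real_test)  # 2890 723
```

Keeping the order intact means the test set only contains days that come after the training period, which is closer to how a forecast is used in practice.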

Step 5: Building the Model

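Now, it’s time to build our model. We’ll use an LSTM (Long Short-Term Memory) network, which is great for time series data because it can capture long-term dependencies. The model stacks two LSTM layers of 50 units each, followed by a Dense layer with a single output for the predicted temperature.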

model = Sequential()
model.add(LSTM(50, return_sequences=True, input_shape=(window_size, 1)))
model.add(LSTM(50))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mean_squared_error')
model.summary()

Output:

Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param # 
=================================================================
lstm (LSTM) (None, 30, 50) 10400 
_________________________________________________________________
lstm_1 (LSTM) (None, 50) 20200 
_________________________________________________________________
dense (Dense) (None, 1) 51 
=================================================================
Total params: 30,651
Trainable params: 30,651
Non-trainable params: 0
_________________________________________________________________
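
If you are curious where the parameter counts in the summary come from, the sketch below reproduces them with the standard LSTM formula (four gates, each with input weights, recurrent weights, and a bias). The function names are made up for this example.

```python
def lstm_params(units, input_dim):
    # Four gates, each with input weights, recurrent weights, and a bias vector
    return 4 * (units * (input_dim + units) + units)

def dense_params(units, input_dim):
    # One weight per input per unit, plus one bias per unit
    return units * input_dim + units

first_lstm = lstm_params(units=50, input_dim=1)    # input is 1 feature per day
second_lstm = lstm_params(units=50, input_dim=50)  # input is the first layer's 50 outputs
output_dense = dense_params(units=1, input_dim=50)

total = first_lstm + second_lstm + output_dense
print(first_lstm, second_lstm, output_dense, total)  # 10400 20200 51 30651
```

The numbers match the 10400, 20200, and 51 reported by model.summary(), for a total of 30,651.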


Step 6: Training the Model

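We’re ready to train our model! LSTM layers expect 3D input of shape (samples, timesteps, features), so we first add a trailing axis to turn each 30-day window into 30 timesteps of one feature. Then we fit the model, holding out 20% of the training data for validation.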

x_train = np.expand_dims(x_train, axis=2)
x_test = np.expand_dims(x_test, axis=2)

history = model.fit(x_train, y_train, epochs=50, batch_size=32, validation_split=0.2)

Output:

Epoch 1/50
73/73 [==============================] - 5s 25ms/step - loss: 23.7394 - val_loss: 6.6627
Epoch 2/50
73/73 [==============================] - 1s 16ms/step - loss: 5.3046 - val_loss: 3.9453
...
Epoch 50/50
73/73 [==============================] - 1s 16ms/step - loss: 1.6107 - val_loss: 1.8825
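
Here is a small sketch of what the reshaping does and why the log shows 73 steps per epoch. It uses a zero-filled placeholder array instead of the real data, and it assumes Keras holds out the last 20% of the samples passed to fit when validation_split=0.2.

```python
import math
import numpy as np

# Placeholder with the same shape as x_train before reshaping
x_train_2d = np.zeros((2890, 30))

# Add a trailing feature axis: (samples, timesteps) -> (samples, timesteps, features)
x_train_3d = np.expand_dims(x_train_2d, axis=2)
print(x_train_3d.shape)  # (2890, 30, 1)

# With validation_split=0.2, 80% of the samples are used for fitting
n_fit = int(2890 * 0.8)
n_val = 2890 - n_fit
print(n_fit, n_val)  # 2312 578

# Batches per epoch with batch_size=32; the last batch is partial
steps_per_epoch = math.ceil(n_fit / 32)
print(steps_per_epoch)  # 73
```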

Step 7: Evaluating the Model with metrics

Once the model is trained, we need to see how well it performs. We will use statistical metrics to evaluate our model’s performance.

from sklearn.metrics import mean_absolute_error, mean_squared_error
import numpy as np

# Make predictions
predictions = model.predict(x_test)

# Flatten the predictions array to match the shape of y_test
predictions = predictions.flatten()

# Calculate MAE
mae = mean_absolute_error(y_test, predictions)
print(f'Mean Absolute Error (MAE): {mae:.4f}')

# Calculate MSE
mse = mean_squared_error(y_test, predictions)
print(f'Mean Squared Error (MSE): {mse:.4f}')

# Calculate RMSE
rmse = np.sqrt(mse)
print(f'Root Mean Squared Error (RMSE): {rmse:.4f}')

Output:

Mean Absolute Error (MAE): 1.1215
Mean Squared Error (MSE): 2.0365
Root Mean Squared Error (RMSE): 1.4279

These metrics summarize how well the model performs on the test set:

  • MAE: On average, the model’s predictions are off by about 1.12 degrees.
  • MSE: The average of the squared errors is about 2.04.
  • RMSE: The square root of the MSE is about 1.43 degrees. It is in the same units as the data and weights large errors more heavily than MAE does.
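
To make the three metrics concrete, here is a minimal sketch that computes them by hand in plain Python. The four temperature pairs are made up for illustration and are not model output.

```python
import math

# Made-up values for illustration only
y_true = [15.0, 12.0, 18.0, 10.0]
y_pred = [14.0, 13.0, 15.0, 10.0]

# Per-sample errors: [-1.0, 1.0, -3.0, 0.0]
errors = [p - t for p, t in zip(y_pred, y_true)]

mae = sum(abs(e) for e in errors) / len(errors)   # (1 + 1 + 3 + 0) / 4
mse = sum(e ** 2 for e in errors) / len(errors)   # (1 + 1 + 9 + 0) / 4
rmse = math.sqrt(mse)

print(f"MAE:  {mae:.4f}")   # MAE:  1.2500
print(f"MSE:  {mse:.4f}")   # MSE:  2.7500
print(f"RMSE: {rmse:.4f}")  # RMSE: 1.6583
```

The single 3-degree miss pulls the RMSE above the MAE, which is why the two are worth reporting side by side.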

Step 8: Fine-Tuning and Improvements

If your model is not performing as well as you’d like, don’t worry! It’s common to go back and tweak your model. You can try different window sizes, more epochs, or even different network architectures. Experimenting with hyperparameters and additional layers can significantly improve your model’s performance.
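
One way to organize such experiments is to list the configurations up front. The sketch below is only a starting point: the candidate values are hypothetical and untuned, and the series length is inferred from the Step 3 output (3613 sequences plus a window of 30). Each configuration would still need to be run through Steps 3 to 7.

```python
from itertools import product

# Series length inferred from the Step 3 output: 3613 sequences + window of 30
series_length = 3613 + 30

# Hypothetical search space; the values are illustrative, not tuned
window_sizes = [7, 30, 60]
lstm_units = [32, 50]

configs = []
for window, units in product(window_sizes, lstm_units):
    n_sequences = series_length - window   # same rule as create_sequences
    n_train = int(n_sequences * 0.8)       # same 80/20 split as Step 4
    configs.append({
        "window_size": window,
        "lstm_units": units,
        "n_sequences": n_sequences,
        "n_train": n_train,
        "input_shape": (window, 1),        # what the first LSTM layer would expect
    })

for cfg in configs:
    print(cfg)
```

Note that a larger window gives the model more history per sample but leaves slightly fewer samples to train on.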

Conclusion

We’ve walked through the entire process of time series analysis using TensorFlow, from loading and exploring the data to building and evaluating a model. Time series forecasting is a powerful tool, and with TensorFlow, it’s easier than ever to get started. Keep experimenting and improving your models, and soon you’ll be a time series forecasting pro.
Happy coding!
