- Introduction-
Stacking is one of the most popular techniques used in machine learning. Machine learning involves a variety of learning models, such as Support Vector Machines (SVMs), Decision Trees, and Artificial Neural Networks (ANNs).
Usually we use these models individually: we train each model wherever it is suitable and then use it on its own to make predictions.
In stacking (as the name suggests), we instead combine multiple models to improve predictive performance. Stacking is hence also known as “Stacked Generalization”.
- How Does Stacking Work?
Stacking typically works in the following steps-
Figure: the stacking workflow (https://drive.google.com/file/d/15TFtIZ9CXJYLs-OQH-ZpA_ezNOXtLesU/view?usp=drive_link)
a. Base Model/Initial Model-
We start by training several base models on the dataset; these models produce the first-level predictions. The base models can be of different types, such as neural networks, regression models, and decision trees, and they can be trained on different subsets of the data.
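To make this step concrete, here is a minimal sketch of training a few heterogeneous base models with scikit-learn. The dataset, the model choices, and the train/test split are illustrative assumptions for this sketch, not a prescription:

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression

# Illustrative dataset and split (assumptions for this sketch)
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Heterogeneous base models; each could also be trained on its own subset of the data
base_models = [
    DecisionTreeClassifier(max_depth=3, random_state=42),
    SVC(random_state=42),
    LogisticRegression(max_iter=1000),
]

for model in base_models:
    model.fit(X_train, y_train)  # first-level training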
b. Meta Learner/Stacked Model-
After the predictions generated by the base models have been collected, they are combined using a meta learner.
Like any machine learning model, the meta learner is trained on a set of input features and the corresponding outputs.
Here, the input features are the predictions generated by the base models, and the meta learner learns to combine them into the final prediction.
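Continuing the sketch above, one common way to build the meta learner's training set is to use out-of-fold predictions (for example via scikit-learn's cross_val_predict), so the meta learner never learns from predictions a base model made on data it was trained on:

import numpy as np
from sklearn.model_selection import cross_val_predict

# Stack the out-of-fold predictions of each base model into a feature matrix:
# one column per base model, one row per training example.
oof_predictions = np.column_stack([
    cross_val_predict(model, X_train, y_train, cv=5)
    for model in base_models
])

# The meta learner is trained on the base models' predictions, not on X_train itself.
meta_learner = LogisticRegression(max_iter=1000)
meta_learner.fit(oof_predictions, y_train)

# At prediction time, each fully trained base model predicts on new data,
# and the meta learner combines those predictions into the final prediction.
test_features = np.column_stack([model.predict(X_test) for model in base_models])
final_predictions = meta_learner.predict(test_features)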
Sample Code-
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Load the Iris dataset
iris = load_iris()
X, y = iris.data, iris.target

# Print information about the Iris dataset
print("Iris dataset information:")
print(iris.DESCR)
print()

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Base models
base_models = [
    RandomForestClassifier(n_estimators=100, random_state=42),
    DecisionTreeClassifier(max_depth=3, random_state=42)  # Adjusted max_depth for decision tree
]

# Initialize empty arrays to store predictions of base models on training and testing data
train_pred_base = np.zeros((X_train.shape[0], len(base_models)))
test_pred_base = np.zeros((X_test.shape[0], len(base_models)))

# Train base models and make predictions
for i, model in enumerate(base_models):
    model.fit(X_train, y_train)
    train_pred_base[:, i] = model.predict(X_train)
    test_pred_base[:, i] = model.predict(X_test)

# Meta-learner (Logistic Regression)
meta_learner = LogisticRegression()

# Train meta-learner using predictions of base models
meta_learner.fit(train_pred_base, y_train)

# Make predictions on the test set using base models and meta-learner
test_pred_stacked = meta_learner.predict(test_pred_base)

# Evaluate the performance of stacked model
accuracy_stacked = accuracy_score(y_test, test_pred_stacked)
print("Accuracy of stacked model:", accuracy_stacked)

# Compare with individual base models
accuracy_base = []
for i, model in enumerate(base_models):
    test_pred_base_individual = model.predict(X_test)
    accuracy_individual = accuracy_score(y_test, test_pred_base_individual)
    print(f"Accuracy of base model {i+1}: {accuracy_individual}")
    accuracy_base.append(accuracy_individual)

# Print the difference in accuracy between stacked model and individual base models
for i, acc in enumerate(accuracy_base):
    print(f"Difference with base model {i+1}: {accuracy_stacked - acc}")
Sample Output-
Sample output (screenshot): https://drive.google.com/file/d/1LgQiT-VzSlN4r1FslouAkxFZkhYM9sXQ/view?usp=drive_link
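Note that the sample code above trains the meta learner on in-sample predictions (each base model predicts on the same data it was fitted on), which can leak information and overstate the stacked model's accuracy. scikit-learn's built-in StackingClassifier avoids this by using cross-validated predictions internally; a minimal sketch, reusing the same data split and base models as the sample code, might look like this:

from sklearn.ensemble import StackingClassifier, RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression

# Base models and meta learner mirror the sample code above
stack = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=100, random_state=42)),
        ("dt", DecisionTreeClassifier(max_depth=3, random_state=42)),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,  # the meta learner is trained on out-of-fold predictions
)
stack.fit(X_train, y_train)
print("Accuracy of StackingClassifier:", stack.score(X_test, y_test))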
Summary-
The main idea behind stacking is to approach a learning problem that a single model cannot solve well by dividing it into smaller sub-parts that can be handled by different models.
Each of these parts is solved by a different model, and at the end the results (predictions) of these models are combined to create the final prediction, which is the solution to the problem.
The final model is thus created by “stacking” a meta learner on top of the base models. In this way we aim for a more refined solution, which typically performs better than solving the problem with any single model alone.