- Introduction-
Stacking is one of the most popular techniques used in machine learning. Machine learning involves a variety of learning models, such as Support Vector Machines (SVMs), Decision Trees, and Artificial Neural Networks (ANNs).
Usually we use these models individually: we train each model wherever it is suitable and then use it on its own to make predictions.
In stacking (as the name suggests), we instead combine multiple models to improve predictive performance. Stacking is hence also known as “Stacked Generalization”.
- How Does Stacking Work?
Stacking typically works in the following steps-
Figure: the stacking workflow (https://drive.google.com/file/d/15TFtIZ9CXJYLs-OQH-ZpA_ezNOXtLesU/view?usp=drive_link)
a. Base Model/Initial Model-
We start by training several base models on the dataset; these models produce the first-level predictions. The base models can be of different types, such as neural networks, regression models, and decision trees, and they can be trained on different subsets of the data.
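To make this step concrete, here is a minimal sketch of training a few heterogeneous base models with scikit-learn. The dataset, the model choices, and the train/test split are illustrative assumptions for this sketch, not a prescription:

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression

# Illustrative dataset and split (assumptions for this sketch)
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Heterogeneous base models; each could also be trained on its own subset of the data
base_models = [
    DecisionTreeClassifier(max_depth=3, random_state=42),
    SVC(random_state=42),
    LogisticRegression(max_iter=1000),
]

for model in base_models:
    model.fit(X_train, y_train)  # first-level training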
b. Meta Learner/Stacked Model-
After the predictions generated by the base models have been collected, they are combined using a meta learner.
Like any machine learning model, the meta learner is trained on a set of input features and the corresponding outputs.
Here, the input features are the predictions generated by the base models, and the meta learner learns to combine them into the final prediction.
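Continuing the sketch above, one common way to build the meta learner's training set is to use out-of-fold predictions (for example via scikit-learn's cross_val_predict), so the meta learner never learns from predictions a base model made on data it was trained on:

import numpy as np
from sklearn.model_selection import cross_val_predict

# Stack the out-of-fold predictions of each base model into a feature matrix:
# one column per base model, one row per training example.
oof_predictions = np.column_stack([
    cross_val_predict(model, X_train, y_train, cv=5)
    for model in base_models
])

# The meta learner is trained on the base models' predictions, not on X_train itself.
meta_learner = LogisticRegression(max_iter=1000)
meta_learner.fit(oof_predictions, y_train)

# At prediction time, each fully trained base model predicts on new data,
# and the meta learner combines those predictions into the final prediction.
test_features = np.column_stack([model.predict(X_test) for model in base_models])
final_predictions = meta_learner.predict(test_features)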
Sample Code-
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Load the Iris dataset
iris = load_iris()
X, y = iris.data, iris.target

# Print information about the Iris dataset
print("Iris dataset information:")
print(iris.DESCR)
print()

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Base models
base_models = [
    RandomForestClassifier(n_estimators=100, random_state=42),
    DecisionTreeClassifier(max_depth=3, random_state=42)  # Adjusted max_depth for decision tree
]

# Initialize empty arrays to store predictions of base models on training and testing data
train_pred_base = np.zeros((X_train.shape[0], len(base_models)))
test_pred_base = np.zeros((X_test.shape[0], len(base_models)))

# Train base models and make predictions
for i, model in enumerate(base_models):
    model.fit(X_train, y_train)
    train_pred_base[:, i] = model.predict(X_train)
    test_pred_base[:, i] = model.predict(X_test)

# Meta-learner (Logistic Regression)
meta_learner = LogisticRegression()

# Train meta-learner using predictions of base models
meta_learner.fit(train_pred_base, y_train)

# Make predictions on the test set using base models and meta-learner
test_pred_stacked = meta_learner.predict(test_pred_base)

# Evaluate the performance of stacked model
accuracy_stacked = accuracy_score(y_test, test_pred_stacked)
print("Accuracy of stacked model:", accuracy_stacked)

# Compare with individual base models
accuracy_base = []
for i, model in enumerate(base_models):
    test_pred_base_individual = model.predict(X_test)
    accuracy_individual = accuracy_score(y_test, test_pred_base_individual)
    print(f"Accuracy of base model {i+1}: {accuracy_individual}")
    accuracy_base.append(accuracy_individual)

# Print the difference in accuracy between stacked model and individual base models
for i, acc in enumerate(accuracy_base):
    print(f"Difference with base model {i+1}: {accuracy_stacked - acc}")
Sample Output-
Sample output (screenshot): https://drive.google.com/file/d/1LgQiT-VzSlN4r1FslouAkxFZkhYM9sXQ/view?usp=drive_link
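Note that the sample code above trains the meta learner on in-sample predictions (each base model predicts on the same data it was fitted on), which can leak information and overstate the stacked model's accuracy. scikit-learn's built-in StackingClassifier avoids this by using cross-validated predictions internally; a minimal sketch, reusing the same data split and base models as the sample code, might look like this:

from sklearn.ensemble import StackingClassifier, RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression

# Base models and meta learner mirror the sample code above
stack = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=100, random_state=42)),
        ("dt", DecisionTreeClassifier(max_depth=3, random_state=42)),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,  # the meta learner is trained on out-of-fold predictions
)
stack.fit(X_train, y_train)
print("Accuracy of StackingClassifier:", stack.score(X_test, y_test))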
Summary-
The main idea behind stacking is to approach a learning problem that a single model cannot solve well by dividing it into smaller sub-parts that can be handled by different models.
Each of these parts is solved by a different model, and at the end the results (predictions) of these models are combined to create the final prediction, which is the solution to the problem.
The final model is thus created by “stacking” a meta learner on top of the base models. In this way we aim for a more refined solution, which typically performs better than solving the problem with any single model alone.