Calculate Feature Importance with Python

In this tutorial we will learn about feature importance. Calculating it is a step in building a machine learning model that shows how much each input feature contributes to the model's predictions.

Feature importance can be calculated using various techniques, depending on the type of data you have and the model you are using. Tree-based models such as random forest and gradient boosting, among others, report feature importance directly. Let us see how this works for a random forest using the Boston housing data set.

1. Import the required libraries and data set.

import pandas as pd
# Note: load_boston was removed in scikit-learn 1.2, so this import needs an older version.
from sklearn.datasets import load_boston
from sklearn.ensemble import RandomForestRegressor

2. Next, load the Boston data set.

boston = load_boston()
x = pd.DataFrame(boston.data, columns=boston.feature_names)
y = pd.Series(boston.target, name='MEDV')  # MEDV = median home value (the target)

3. Fit a random forest model.

model = RandomForestRegressor()
model.fit(x, y)

4. Get the feature importances.

# The attribute name ends with an underscore and is available only after fit().
importances = model.feature_importances_

5. Create a DataFrame to display the feature importances.

feature_importance_df = pd.DataFrame(importances, index=x.columns, columns=['Importance'])
feature_importance_df.sort_values(by='Importance',ascending=False, inplace=True)
print(feature_importance_df)
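
For readers whose scikit-learn installation does not include `load_boston`, here is a minimal sketch of the same five steps on the diabetes data set that ships with scikit-learn. The data set is only a stand-in chosen for this sketch, not the one used in this tutorial, and `n_estimators=100` and `random_state=0` are assumptions added to make runs repeatable.

```python
import pandas as pd
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor

# Stand-in data set (assumption): bundled with scikit-learn, no download needed.
diabetes = load_diabetes(as_frame=True)
X = diabetes.data      # DataFrame with 10 feature columns
y = diabetes.target    # Series holding the regression target

# Fit a random forest; random_state is fixed only to make runs repeatable.
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X, y)

# One importance value per feature, sorted from most to least important.
feature_importance_df = pd.DataFrame(
    model.feature_importances_, index=X.columns, columns=["Importance"]
).sort_values(by="Importance", ascending=False)
print(feature_importance_df)
```

The importances of a fitted random forest are normalized, so the `Importance` column sums to 1 and each value can be read as a share of the total.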

This code prints the importance of each feature in predicting the target variable, which in this case is the median house value. You can adjust the model or explore other algorithms to see whether they provide different insights into feature importance.
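
One way to try another algorithm, sketched here on synthetic data rather than the housing data, is to fit a random forest and a gradient boosting model on the same features and put their importances side by side. The data set parameters, the feature names `f0` to `f7`, and `random_state=0` are all assumptions made for this illustration.

```python
import pandas as pd
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor

# Synthetic data (assumption): 8 features, only 3 of which drive the target.
X_arr, y = make_regression(
    n_samples=500, n_features=8, n_informative=3, noise=10.0, random_state=0
)
feature_names = [f"f{i}" for i in range(X_arr.shape[1])]  # hypothetical names
X = pd.DataFrame(X_arr, columns=feature_names)

models = {
    "RandomForest": RandomForestRegressor(random_state=0),
    "GradientBoosting": GradientBoostingRegressor(random_state=0),
}

# Fit each model and collect its importances, one column per model.
comparison = pd.DataFrame(
    {name: m.fit(X, y).feature_importances_ for name, m in models.items()},
    index=feature_names,
)
print(comparison.sort_values(by="RandomForest", ascending=False))
```

Both columns sum to 1, so the values are comparable as shares, but the two models can still rank weaker features differently.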

 
