Feature Selection using scikit-learn, Feature-engine and Mlxtend in Python
By Viraj Nayak
An overview of different feature selection methods in the scikit-learn, Feature-engine and Mlxtend libraries.
Feature selection is the process of choosing the features that contribute most to a model's predictions. Training on a smaller set of features reduces model complexity and makes the model faster and computationally cheaper to train and use. If the right features are selected, it can also improve the model's accuracy. The project covers the following feature selection methods:
- Filter Methods: Constant and Quasi-Constant Features, Duplicated Features, Correlated Features, Single Feature Model Performance, Mutual Information, Chi-square and ANOVA.
- Wrapper Methods: Step Forward Feature Selection, Step Backward Feature Selection and Exhaustive Feature Selection.
- Embedded Methods: Linear model coefficients, Lasso regularization and tree-based feature importance.
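As a rough illustration of how the three families differ, here is a minimal sketch that runs one method from each using scikit-learn only. It is not taken from the project notebooks: the synthetic dataset, the estimators and the numbers of features kept are all assumptions chosen for brevity, and the Feature-engine and Mlxtend selectors used in the notebooks are not shown here.

```python
from functools import partial

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import (
    SelectFromModel,
    SelectKBest,
    SequentialFeatureSelector,
    VarianceThreshold,
    mutual_info_classif,
)
from sklearn.linear_model import LogisticRegression

# Synthetic classification data (an assumption for illustration only).
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=4, n_redundant=2, random_state=0
)

# Filter method: remove constant features, then keep the 5 features
# with the highest mutual information with the target.
X_var = VarianceThreshold(threshold=0.0).fit_transform(X)
mi_score = partial(mutual_info_classif, random_state=0)
filter_selector = SelectKBest(score_func=mi_score, k=5).fit(X_var, y)

# Wrapper method: step forward selection, adding one feature at a time
# based on the cross-validated score of the wrapped estimator.
wrapper_selector = SequentialFeatureSelector(
    LogisticRegression(max_iter=1000),
    n_features_to_select=3,
    direction="forward",
    cv=3,
).fit(X, y)

# Embedded method: keep features whose tree-based importance is at
# least the mean importance (the default threshold for this estimator).
embedded_selector = SelectFromModel(
    RandomForestClassifier(n_estimators=100, random_state=0)
).fit(X, y)

# Each selector exposes a boolean mask over the columns it was fitted on.
print("Filter:  ", filter_selector.get_support(indices=True))
print("Wrapper: ", wrapper_selector.get_support(indices=True))
print("Embedded:", embedded_selector.get_support(indices=True))
```

Setting direction="backward" in SequentialFeatureSelector gives step backward selection instead. The filter step is the cheapest because it never trains a model on the full feature set, while the wrapper step refits the estimator many times and is typically the slowest of the three.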
Jupyter Notebooks in the project:
- Feature Selection Methods: This file contains an overview of all the mentioned methods and their implementation on a custom classification dataset.
- Implementation on House Price Dataset: This file contains the implementation of different methods on the House Price Dataset and a comparison of scores and time taken by each method.
Installing the libraries:
- Feature-engine: pip install feature-engine
- Mlxtend: pip install mlxtend