By Viraj Nayak
An overview of the undersampling and oversampling methods in the imbalanced-learn library for handling imbalanced data.
Since most machine learning algorithms assume balanced class distributions, imbalanced datasets pose a challenge. If a class has few samples, those samples tend to be misclassified more often. The balance ratio is the number of observations in the minority class divided by the number in the majority class. Classes can be balanced either by undersampling the majority class or by oversampling the minority class.
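To make the balance ratio and the two basic strategies concrete, here is a minimal sketch in plain Python. It is not how imbalanced-learn is implemented; the function names (balance_ratio, random_undersample, random_oversample) and the toy data are assumptions made only for illustration. Undersampling here keeps a random subset of every class equal in size to the smallest class, and oversampling duplicates randomly chosen samples until every class matches the largest one.

```python
import random
from collections import Counter


def balance_ratio(labels):
    """Minority class count divided by majority class count."""
    counts = Counter(labels)
    return min(counts.values()) / max(counts.values())


def random_undersample(samples, labels, seed=0):
    """Randomly drop samples so every class matches the smallest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = min(counts.values())
    kept = []
    for cls in counts:
        idx = [i for i, lab in enumerate(labels) if lab == cls]
        kept.extend(rng.sample(idx, target))  # sample without replacement
    kept.sort()
    return [samples[i] for i in kept], [labels[i] for i in kept]


def random_oversample(samples, labels, seed=0):
    """Randomly duplicate samples so every class matches the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    chosen = list(range(len(labels)))  # keep every original sample
    for cls, n in counts.items():
        idx = [i for i, lab in enumerate(labels) if lab == cls]
        chosen.extend(rng.choices(idx, k=target - n))  # sample with replacement
    return [samples[i] for i in chosen], [labels[i] for i in chosen]


# Hypothetical toy data: 8 majority samples, 2 minority samples.
samples = list(range(10))
labels = ["neg"] * 8 + ["pos"] * 2

print(balance_ratio(labels))  # 0.25

_, under_labels = random_undersample(samples, labels)
_, over_labels = random_oversample(samples, labels)
print(Counter(under_labels))  # both classes have 2 samples
print(Counter(over_labels))   # both classes have 8 samples
```

Undersampling discards information from the majority class, while random oversampling repeats minority samples exactly, so both change the data the model sees in different ways.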
This project gives an overview of different under- and oversampling techniques and their effect on the data. We then apply them to two classification datasets to compare their effect on model performance.
Jupyter Notebooks in the project:
To install imbalanced-learn: pip install imbalanced-learn
Submitted by Viraj Nayak (nayakviraj21)