A Machine Learning model that predicts the product category using the product description.
The aim of this project is to predict the category of a product using the product description.
Programming language used: Python
Development environment used: Jupyter Notebook
Python libraries used
1. Pandas: pandas is a powerful and flexible open-source library for data analysis and manipulation. It offers two main data structures: the DataFrame, which represents data in tabular form, and the Series, which represents a single labeled column. It also provides methods to analyze, filter, and manipulate the data. It is imported with the standard alias pd.
2. Matplotlib: Matplotlib is a powerful data visualization library that allows us to draw graphs and charts from the data in a pandas DataFrame. The pyplot module of the library is the one most commonly used for plotting graphs and is imported with the standard alias plt.
3. Seaborn: seaborn is a data visualization library based on matplotlib. It provides a high-level interface for drawing attractive and informative graphs. It is imported with the alias sns.
4. Scikit-learn (sklearn): Scikit-learn is a powerful library that is commonly used to build machine learning models. It supports classification, regression, and clustering algorithms. It also provides tools for data preprocessing and evaluation metrics for a machine learning model.
The dataset used contains 20,000 rows and 15 columns. However, only one of these columns (the product description) has been used as a feature to predict the target variable (the product category).
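The idea of predicting a category from a description alone can be illustrated with a minimal sketch. It is not the notebook's actual code: the choice of TF-IDF features with logistic regression, the column names "description" and "category", and the toy rows below are all assumptions made for illustration.

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical toy data standing in for the real dataset.
# The column names "description" and "category" are assumptions.
data = pd.DataFrame({
    "description": [
        "cotton round neck t-shirt for men",
        "slim fit denim jeans for women",
        "printed cotton kurta for women",
        "wooden study table with drawer",
        "three seater fabric sofa for living room",
        "solid wood queen size bed",
        "stainless steel analog watch for men",
        "digital sports watch with strap",
        "leather strap analog watch for women",
    ],
    "category": ["Clothing"] * 3 + ["Furniture"] * 3 + ["Watches"] * 3,
})

# Turn each description into TF-IDF word weights, then fit a classifier on them.
# This model choice is an assumption, not necessarily the one used in the project.
model = make_pipeline(
    TfidfVectorizer(stop_words="english"),
    LogisticRegression(max_iter=1000),
)
model.fit(data["description"], data["category"])

# Predict the category of unseen descriptions.
predictions = model.predict(["analog watch with leather strap", "cotton t-shirt"])
print(predictions)
```

With the real dataset, the descriptions and categories would come from the corresponding columns of the Excel file, and part of the data would normally be held out to evaluate the model.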
The zipped folder contains the following:
1. The source file (a Jupyter notebook; to open it, you need to have Jupyter Notebook installed on your computer)
2. A PDF file containing a detailed description of the project
3. The dataset (an Excel file)
Submitted by Sinjini Ghosh (Sinjini)