Coders Packet

Content-Based Recommendation System in Python Using BOW and TF-IDF


A content-based recommendation system built in Python using the Bag of Words (BOW) and TF-IDF models. Dataset: Amazon's women's apparel products.

Libraries and Modules Used:

       1. scikit-learn

       2. NLTK

       3. Seaborn

       4. NumPy

       5. Pandas

       6. Matplotlib 

Dataset Description:

        Amazon's Women's Apparel Product dataset (180k data points), obtained from the Amazon Associates Products API.


0. Perform exploratory data analysis to understand the dataset better.

1. Data preprocessing: remove rows with null values, duplicates, and near-duplicate items.

2. Text preprocessing: filter out stopwords and special characters from the title of each product.

3. Extract numerical features from each product title in the form of vectors.

4. Compute the similarity between the title vectors using measures such as Euclidean distance or cosine similarity.

5. Recommend the 'n' products most similar to a given product, using the BOW (Bag of Words) and TF-IDF models.
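The steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code. The toy titles, the `title` column name, and the helpers `clean_title` and `recommend` are all hypothetical. To stay self-contained, the sketch uses scikit-learn's built-in English stop-word list rather than NLTK's, and it removes only exact duplicates, not near-duplicates.

```python
import re

import numpy as np
import pandas as pd
from sklearn.feature_extraction.text import (
    ENGLISH_STOP_WORDS,
    CountVectorizer,
    TfidfVectorizer,
)
from sklearn.metrics import pairwise_distances

# Hypothetical toy data standing in for the real dataset.
products = pd.DataFrame(
    {
        "title": [
            "Women's Red Cotton T-Shirt",
            "Women's Red Cotton T-Shirt",  # exact duplicate
            "Women's Blue Cotton T-Shirt",
            "Black Leather Jacket for Women",
            "Floral Summer Dress",
            None,  # missing title
        ]
    }
)


def clean_title(title):
    """Lowercase the title, drop special characters, and remove stop words."""
    words = re.sub(r"[^a-z0-9 ]", " ", title.lower()).split()
    return " ".join(w for w in words if w not in ENGLISH_STOP_WORDS)


# Step 1: drop rows with a null title and exact duplicate titles.
products = products.dropna(subset=["title"]).drop_duplicates(subset=["title"])
products = products.reset_index(drop=True)

# Step 2: text preprocessing of the titles.
products["clean_title"] = products["title"].apply(clean_title)


def recommend(query_index, n, vectorizer, metric="cosine"):
    """Return the titles of the n products closest to the query product."""
    # Step 3: turn each cleaned title into a numerical vector.
    vectors = vectorizer.fit_transform(products["clean_title"])
    # Step 4: distance from every title to the query title.
    distances = pairwise_distances(
        vectors, vectors[query_index], metric=metric
    ).ravel()
    # Step 5: smallest distance first, skipping the query product itself.
    order = np.argsort(distances)
    order = order[order != query_index][:n]
    return products["title"].iloc[order].tolist()


bow_results = recommend(0, 2, CountVectorizer())
tfidf_results = recommend(0, 2, TfidfVectorizer())
print("BOW:   ", bow_results)
print("TF-IDF:", tfidf_results)
```

Passing `metric="euclidean"` to `recommend` switches the similarity measure. On a dataset of 180k titles, the vectorizer would normally be fitted once and reused rather than refitted on every call as it is here.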

Observation: The TF-IDF model gave better recommendation results than the BOW model.
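The project does not say why TF-IDF did better. One plausible reason, offered here as an assumption, is that TF-IDF gives less weight to words that appear in most titles, so the distinguishing words dominate the similarity score. The small sketch below uses made-up titles to compare the weights the two models assign.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

# Hypothetical cleaned titles: "women" occurs in all of them, "red" in one.
titles = [
    "women red cotton shirt",
    "women blue denim jeans",
    "women black leather jacket",
]

bow = CountVectorizer()
bow_row = bow.fit_transform(titles)[0].toarray().ravel()

tfidf = TfidfVectorizer()
tfidf_row = tfidf.fit_transform(titles)[0].toarray().ravel()

# vocabulary_ maps each word to its column index in the vector.
bow_women = bow_row[bow.vocabulary_["women"]]
bow_red = bow_row[bow.vocabulary_["red"]]
tfidf_women = tfidf_row[tfidf.vocabulary_["women"]]
tfidf_red = tfidf_row[tfidf.vocabulary_["red"]]

# BOW counts both words once; TF-IDF weights the rarer word more heavily.
print("BOW:   ", bow_women, bow_red)
print("TF-IDF:", round(tfidf_women, 3), round(tfidf_red, 3))
```

In the first title, BOW treats "women" and "red" as equally important, while TF-IDF gives "red" the larger weight because it occurs in fewer titles.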
