Category Prediction Through Description using BERT MODEL

NLP

Details: Use the given dataset to build a model that predicts the category from the description. Write the code in Python. Using a Jupyter notebook is encouraged.

Show how you would clean and process the data
Show how you would visualize this data
Show how you would measure the accuracy of the model
What ideas do you have to improve the accuracy of the model? What other algorithms would you try?

About Data :

You have to clean this data. In the product category tree, separate all the categories, identify the primary category, and then use the model to predict it. If you want to remove some categories for lack of data, you are free to do so.

Note:

Goal is to predict the product category.
Description should be the main feature. Feel free to use other features if it'd improve the model.
Include a Readme.pdf file that describes the approach in detail and reports the accuracy and the models used.

Product Classifier

This project is about multi-class classification using NLP. We are given a Flipkart dataset, from which we predict the primary category of a product using its description.

Installation

To reproduce the results in the notebook, install the following libraries. Installing them inside a virtual environment is preferable.

pip install -q tensorflow-text
pip install -q tf-models-official
pip install wordcloud
pip install gensim
pip install nltk
pip install spacy
pip install transformers
pip install wget
pip install pandas
pip install numpy
pip install seaborn
pip install matplotlib
pip install torch

Approach

1) Data For Analysis

From the product_category_tree, we take the root of the tree as the primary category.
Example: For the category tree '["Footwear >> Women's Footwear >> Ballerinas >> AW Bellies"]', the primary category is Footwear.
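
The root extraction described above can be sketched as follows. This is a minimal illustration, not the notebook's actual code: the function name primary_category is hypothetical, and it assumes each product_category_tree value is a JSON-style list holding one ">>"-separated path, as in the example.

```python
import json


def primary_category(tree):
    """Return the root of a product_category_tree value (hypothetical helper)."""
    try:
        # Values such as '["Footwear >> Women's Footwear >> ..."]' parse as a JSON list.
        paths = json.loads(tree)
    except (TypeError, ValueError):
        paths = None
    if isinstance(paths, list) and paths:
        first_path = str(paths[0])
    else:
        # Fallback for values that are not valid JSON: strip brackets and quotes.
        first_path = str(tree).strip("[]\"' ")
    # The primary category is the text before the first ">>" separator.
    return first_path.split(">>")[0].strip()


example = '["Footwear >> Women\'s Footwear >> Ballerinas >> AW Bellies"]'
print(primary_category(example))  # Footwear
```

With pandas, a helper like this would typically be applied to the whole column, for example with Series.apply.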

2) Exploratory Data Analysis

This section visualizes the dataset.

3) Data Cleaning

The data is cleaned before it goes through data preparation.
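
The README does not list the exact cleaning steps, so the following is only a sketch of what cleaning a description might look like. The function name clean_description and the specific choices (dropping HTML tags, lowercasing, keeping letters only) are assumptions, not the notebook's confirmed procedure.

```python
import html
import re


def clean_description(text):
    """Normalize a raw product description (assumed steps, for illustration)."""
    text = html.unescape(str(text))          # decode entities such as &amp;
    text = re.sub(r"<[^>]+>", " ", text)     # drop HTML tags
    text = text.lower()
    text = re.sub(r"[^a-z\s]", " ", text)    # keep letters only (assumption)
    return re.sub(r"\s+", " ", text).strip() # collapse repeated whitespace


print(clean_description("Key Features:<br> 100% Cotton,  Slim-Fit!"))
# key features cotton slim fit
```

Whether digits and punctuation should be removed depends on the data; sizes and model numbers can carry category signal, so this step is worth checking against validation results.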

4) Data Preparation

Data preprocessing is done as follows:
Tokenization of the product description (after the description has been cleaned).
Lemmatization of the tokenized data to prepare it for use in the model.
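
The two steps above can be sketched as a small pipeline. This is an illustration under stated assumptions: the names tokenize, prepare, and TOY_LEMMAS are hypothetical, the tokenizer is a simple regular expression, and the lemmatizer is passed in as a function. In practice a lemmatizer from NLTK or spaCy (both in the install list) would be supplied instead of the toy lookup table used here.

```python
import re


def tokenize(text):
    """Split cleaned text into lowercase alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def prepare(text, lemmatize=lambda token: token):
    """Tokenize, then map every token through the given lemmatizer."""
    return [lemmatize(token) for token in tokenize(text)]


# Toy lookup table, for demonstration only. A real run could pass, for example,
# nltk.stem.WordNetLemmatizer().lemmatize (requires the WordNet corpus).
TOY_LEMMAS = {"shoes": "shoe", "running": "run"}

tokens = prepare("Running shoes for women",
                 lemmatize=lambda t: TOY_LEMMAS.get(t, t))
print(tokens)  # ['run', 'shoe', 'for', 'women']
```

Keeping the lemmatizer as a parameter makes it easy to compare different libraries without changing the rest of the pipeline.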

5) Model Preparation and Training

Here we remove all product categories that have fewer than 10 products.
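
A minimal sketch of this filtering step, assuming the data is available as (description, category) pairs. The function name drop_rare_categories and the min_count parameter are hypothetical; the threshold of 10 comes from the text above.

```python
from collections import Counter


def drop_rare_categories(rows, min_count=10):
    """Keep only rows whose category occurs at least min_count times."""
    counts = Counter(category for _, category in rows)
    return [(desc, cat) for desc, cat in rows if counts[cat] >= min_count]


# Tiny synthetic example: 10 Footwear rows are kept, 9 Pens rows are dropped.
rows = [("shoe text", "Footwear")] * 10 + [("pen text", "Pens")] * 9
kept = drop_rare_categories(rows)
print(len(kept))  # 10
```

With a pandas DataFrame, the same idea is usually written with value_counts on the category column followed by a boolean mask.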

6) Result

After detailed analysis, I used the BERT (Bidirectional Encoder Representations from Transformers) model. It is designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context.

F1-Score

Training loss: 0.018742394280544406

Validation loss: 0.15305931800355793

F1 Score (Weighted): 0.9792079018736954
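
For reference, a weighted F1 score averages the per-class F1 values, weighting each class by its number of true samples. The sketch below computes it in plain Python; the function name weighted_f1 is hypothetical, and the notebook may well use a library routine such as sklearn.metrics.f1_score with average="weighted" instead.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Per-class F1, averaged with weights equal to each class's support."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += f1 * n / total
    return score


# Synthetic labels for illustration only, not the project's data.
y_true = ["a", "a", "b", "b"]
y_pred = ["a", "b", "b", "b"]
print(round(weighted_f1(y_true, y_pred), 4))  # 0.7333
```

Because the weights follow class frequency, a high weighted F1 can hide poor performance on small categories; a per-class report is a useful companion metric.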

Accuracy
