This project is a spam classification system built on the Multinomial Naive Bayes algorithm. It pre-processes an adaptive dataset of emails using Natural Language Processing techniques: tokenization, stop-word removal, and stemming. The pre-processed data is then used to train the Multinomial Naive Bayes classifier. The model follows a bag-of-words approach, in which each email is represented as a vector of word frequencies. The project is written in Python and uses pandas for data processing, scikit-learn for modeling, and matplotlib for visualization. Model performance is evaluated using precision, recall, and F1-score. The project also visualizes the most frequently used words in spam and ham emails.
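The pipeline described above can be illustrated with a minimal, standard-library-only sketch. It is not the repository's code: the project itself relies on scikit-learn, and everything here (the toy emails, the tiny `STOP_WORDS` set, the crude suffix stemmer, and the function names `preprocess`, `train`, `predict`, and `precision_recall_f1`) is a hypothetical stand-in chosen to show roughly what a Multinomial Naive Bayes classifier computes over bag-of-words counts, assuming Laplace (add-one) smoothing.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real pipelines use a fuller list).
STOP_WORDS = {"a", "an", "the", "to", "is", "at", "for", "you", "your", "now", "and", "of"}


def preprocess(text):
    """Lowercase, tokenize on letters, drop stop words, strip a few suffixes."""
    stems = []
    for tok in re.findall(r"[a-z]+", text.lower()):
        if tok in STOP_WORDS:
            continue
        # Crude stand-in for a real stemmer such as Porter's.
        for suffix in ("ing", "ed", "s"):
            if tok.endswith(suffix) and len(tok) > len(suffix) + 2:
                tok = tok[: -len(suffix)]
                break
        stems.append(tok)
    return stems


def train(docs, labels):
    """Build per-class bag-of-words counts, class document counts, and the vocabulary."""
    word_counts = {}
    class_counts = Counter(labels)
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = preprocess(text)
        word_counts.setdefault(label, Counter()).update(tokens)
        vocab.update(tokens)
    return word_counts, class_counts, vocab


def predict(text, model):
    """Return the class with the highest log-posterior under add-one smoothing."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    tokens = [t for t in preprocess(text) if t in vocab]  # ignore unseen words
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total_docs)  # log prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokens:
            score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def precision_recall_f1(y_true, y_pred, positive="spam"):
    """Compute precision, recall, and F1 for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Toy data, invented purely for illustration.
train_docs = [
    "Win a free prize now",
    "Free cash offer, claim your prize",
    "Meeting scheduled for Monday morning",
    "Please review the attached project report",
]
train_labels = ["spam", "spam", "ham", "ham"]
model = train(train_docs, train_labels)

test_docs = ["Claim your free prize", "Project meeting on Monday"]
test_labels = ["spam", "ham"]
predictions = [predict(doc, model) for doc in test_docs]
precision, recall, f1 = precision_recall_f1(test_labels, predictions)
print(predictions, precision, recall, f1)
```

In a scikit-learn based project, the same roles would typically be filled by `CountVectorizer` for the word-frequency vectors, `MultinomialNB` for the classifier, and `classification_report` for the metrics; the hand-written version above only makes the underlying arithmetic visible.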
AvichalS/Email-Spam-Filtering