This is a group research project for COMP 562: Machine Learning at UNC Chapel Hill.
Members: @kipprr, @tdwatson, @tasnias, @brennora
Topic: Disaster Tweets
Dataset: NLP Disaster Tweets (from Kaggle)
Description: Given a tweet, decide whether or not it describes an actual disaster. One possible use case is news agencies that want to follow developing events by using Twitter as a data source.
Methods used:
- Clean and preprocess the data, then vectorize it using CountVectorizer (bag of words) and TF-IDF
- Logistic Regression model: evaluate precision, recall, accuracy, F1 score, confusion matrix, ROC curve, and AUC score
- SVM models: evaluate precision, recall, accuracy, F1 score, confusion matrix, ROC curve, and AUC score
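
The methods listed above can be sketched roughly as follows with scikit-learn. This is a minimal illustration, not the project's actual code: the toy tweets and labels are made up, the Kaggle CSV is not loaded, and the choice of `LinearSVC`, `make_pipeline`, and the helper name `evaluate` are assumptions made for the example.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score, roc_auc_score)
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy data invented for illustration (1 = real disaster, 0 = not a disaster).
train_texts = [
    "Forest fire spreading near the highway, evacuations underway",
    "Earthquake of magnitude 6 hits the coast, buildings damaged",
    "Flood waters rising downtown, roads closed",
    "This new album is fire, can't stop listening",
    "My exam was a total disaster lol",
    "Traffic is killing me this morning",
]
train_labels = [1, 1, 1, 0, 0, 0]
test_texts = [
    "Wildfire forces evacuations near the coast",
    "That movie was a disaster, do not watch it",
]
test_labels = [1, 0]


def evaluate(vectorizer, classifier):
    """Fit a vectorizer + classifier pipeline and score it on the held-out toy tweets."""
    model = make_pipeline(vectorizer, classifier)
    model.fit(train_texts, train_labels)
    predictions = model.predict(test_texts)
    # decision_function gives continuous scores, which AUC (and an ROC curve) needs.
    scores = model.decision_function(test_texts)
    return {
        "accuracy": accuracy_score(test_labels, predictions),
        "precision": precision_score(test_labels, predictions, zero_division=0),
        "recall": recall_score(test_labels, predictions, zero_division=0),
        "f1": f1_score(test_labels, predictions, zero_division=0),
        "confusion_matrix": confusion_matrix(test_labels, predictions, labels=[0, 1]),
        "auc": roc_auc_score(test_labels, scores),
    }


# Try each feature representation with each classifier.
results = {}
for vec_name, make_vec in [("bag_of_words", CountVectorizer), ("tfidf", TfidfVectorizer)]:
    for clf_name, make_clf in [("logreg", LogisticRegression), ("svm", LinearSVC)]:
        results[(vec_name, clf_name)] = evaluate(make_vec(), make_clf())

for key, metrics in results.items():
    print(key, {k: v for k, v in metrics.items() if k != "confusion_matrix"})
```

With only a handful of toy tweets the printed numbers are not meaningful; on the real dataset the same loop would be run over a proper train/test split, and `sklearn.metrics.roc_curve` could be applied to the same scores to plot the ROC curve.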