A machine learning project that predicts stock price movements by combining traditional financial indicators with sentiment analysis of social media (Twitter) data. Built as part of the Design Project at IIIT Vadodara.
Stock prices are influenced not only by financial indicators but also by public sentiment and news. Social media is a rich source of that sentiment, and crowd opinion expressed there can affect market trends.
This project aims to predict stock price behavior using:
- Historical stock data (open, high, low, close, volume)
- Sentiment features derived from Twitter
- Statistical and machine learning models
- Predict stock price trends using machine learning models.
- Incorporate Twitter sentiment as a predictive feature.
- Compare performance with and without sentiment data.
- Improve on past models with better accuracy and generalization.
- Prior research shows sentiment analysis improves stock price prediction accuracy.
- Earlier work reported accuracies of ~70–80% using classical models.
- Our goal: surpass those benchmarks using combined feature engineering and optimized learning models.
- Collected daily stock features: Date, Open, High, Low, Close, etc.
- Trained models like:
  - Linear Regression → Error ≈ 1.325
  - Random Forest → Error ≈ 0.868
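A minimal sketch of how a regression baseline like the ones above could be set up and scored. The write-up states neither the prediction target nor the error metric, so this assumes next-day close is predicted from the current day's close and that error is measured as RMSE. The sample prices and the names `fit_line` and `rmse` are hypothetical, and the example uses only the standard library rather than the project's actual tooling.

```python
import math
from statistics import fmean


def fit_line(xs, ys):
    """Ordinary least squares for one feature: y = slope * x + intercept."""
    x_mean, y_mean = fmean(xs), fmean(ys)
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


def rmse(actual, predicted):
    """Root mean squared error (assumed metric; the source does not name one)."""
    return math.sqrt(fmean((a - p) ** 2 for a, p in zip(actual, predicted)))


# Hypothetical closing prices, not project data.
closes = [100.0, 101.5, 101.0, 102.5, 103.0, 104.5, 104.0, 105.5]
today, next_day = closes[:-1], closes[1:]

slope, intercept = fit_line(today, next_day)
predictions = [slope * x + intercept for x in today]
print(f"RMSE = {rmse(next_day, predictions):.3f}")
```

Scoring every model with the same error function on the same held-out days is what makes figures such as the two errors listed above comparable.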
- Collected tweets from the past 24 hours using Tweepy (a Python client for the Twitter API).
- Preprocessed tweets for noise reduction.
- Used WordNet and SentiWordNet for sentiment scoring.
- Extracted features:
  - Positivity Score
  - Negativity Score
- Used a variation of Word2Vec to embed tweets.
- Generated weighted averages of tweet vectors across time spans.
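The tweet-processing steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's code: `TOY_LEXICON` is a hypothetical stand-in for SentiWordNet scores, the cleaning rules are a guess at what "noise reduction" involved, and the weights passed to `weighted_average` are supplied by the caller because the write-up does not say how tweets were weighted.

```python
import re

# Hypothetical stand-in for SentiWordNet: word -> (positive, negative) score.
TOY_LEXICON = {
    "gain": (0.625, 0.0),
    "strong": (0.5, 0.0),
    "loss": (0.0, 0.75),
    "weak": (0.0, 0.5),
}


def clean_tweet(text):
    """Lowercase, drop URLs and @mentions, and return alphabetic tokens."""
    text = re.sub(r"https?://\S+", " ", text.lower())
    text = re.sub(r"@\w+", " ", text)
    return re.findall(r"[a-z]+", text)


def sentiment_scores(tokens, lexicon=TOY_LEXICON):
    """Return (positivity, negativity) averaged over tokens in the lexicon."""
    hits = [lexicon[t] for t in tokens if t in lexicon]
    if not hits:
        return 0.0, 0.0
    positivity = sum(p for p, _ in hits) / len(hits)
    negativity = sum(n for _, n in hits) / len(hits)
    return positivity, negativity


def weighted_average(vectors, weights):
    """Weighted mean of equal-length tweet vectors."""
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    dim = len(vectors[0])
    return [
        sum(w * v[i] for v, w in zip(vectors, weights)) / total
        for i in range(dim)
    ]


tokens = clean_tweet("@trader $AAPL strong gain today! https://t.co/x")
print(tokens, sentiment_scores(tokens))
```

In a real pipeline the vectors given to `weighted_average` would come from the embedding model, and the weights might reflect recency or engagement. Both choices are assumptions here.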
- Tried various models; best results with:
  - LSTM (Long Short-Term Memory) neural network
- Configuration:
  - Activation: tanh
  - Optimizer: RMSprop
  - Epochs: 500
- Average model accuracy: ~90%
- LSTM outperformed traditional regression-based models.
- Models trained with sentiment features performed significantly better than those without.
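For readers who want to reproduce the LSTM configuration listed above, here is a hedged sketch using Keras. Only the activation, optimizer, and epoch count come from this write-up. The number of units, the single `Dense` output, the mean squared error loss, and the names `LSTM_SETTINGS` and `build_lstm` are assumptions made for illustration, and the original project may have used a different framework or architecture.

```python
# Settings stated in the write-up.
LSTM_SETTINGS = {"activation": "tanh", "optimizer": "rmsprop", "epochs": 500}


def build_lstm(window, n_features, units=64):
    """Build a small LSTM regressor.

    `units`, the Dense(1) head, and the MSE loss are assumptions;
    the source does not specify them.
    """
    # Imported lazily so the settings above can be read without TensorFlow.
    from tensorflow import keras

    model = keras.Sequential([
        keras.layers.Input(shape=(window, n_features)),
        keras.layers.LSTM(units, activation=LSTM_SETTINGS["activation"]),
        keras.layers.Dense(1),
    ])
    model.compile(optimizer=LSTM_SETTINGS["optimizer"], loss="mse")
    return model
```

With TensorFlow installed, training would look like `build_lstm(window, n_features).fit(X, y, epochs=LSTM_SETTINGS["epochs"])`, where `X` has shape `(samples, window, n_features)` and combines the price and sentiment features.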
- Public sentiment on platforms like Twitter plays a key role in stock price behavior.
- Combining financial and sentiment data improves prediction accuracy.
- Our model predicted stock behavior reasonably well, though not exact prices.
- Train on longer historical windows (more than one day).
- Expand to multiple companies across sectors.
- Deploy as a live application or alert system.
- Use real-time sentiment streams for intraday predictions.
- Vikash Choudhary (201851144)
- Himanshu Bhadu (201851048)
- Ritik Rawat (201851102)
- Noorul Hasan Ali (201851078)
Indian Institute of Information Technology Vadodara
Government Engineering College, Sector-28, Gandhinagar, Gujarat - 382028
Dr. S.K. Patra
IIIT Vadodara
- Illustrated Guide to LSTM’s and GRU’s – Michael Phi (Towards Data Science)
- Stock Market Prediction Using Twitter Sentiment Analysis – Stanford
- Machine Learning for Stock Prediction – Towards Data Science
- Tweepy Documentation – Python Twitter API
- SentiWordNet
📂 For more details, refer to the full project report:
DesignProject-Report.pdf