Predicting Black Friday purchase vs. return outcomes to reduce return-related costs for retail inventory optimization.
Black Friday generates massive transaction volumes but also high product return rates. Returns are costly for retailers: each one adds shipping and restocking expenses and reverses the revenue from the original sale. This project uses machine learning to predict whether a transaction will end as a kept purchase (P) or a return (R), enabling:
- Reduced return-related costs
- Optimized inventory stocking
- Data-driven product recommendations

| Metric | Value |
|---|---|
| Purchase Precision | 78% |
| Return Recall | 58% |
| Accuracy | 67.7% |
| Projected ROI | 227% (~$590K) |

| Scenario | Revenue |
|---|---|
| Baseline (50% accuracy) | $1.89M |
| With Model (78% precision) | $2.94M |
| Net Gain | $1.05M |
ROI figures are projected estimates based on model performance assumptions.
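As a quick check on the scenario table, the net gain is simply the difference between the two revenue rows. The short sketch below reproduces that arithmetic and also derives a relative uplift, which is computed here from the table and is not a figure reported by the project. It does not attempt to reproduce the 227% ROI, because the cost basis behind that number is not stated in this README. The variable names are illustrative.

```python
# Revenue figures copied from the scenario table, in millions of USD.
baseline_revenue_musd = 1.89   # Baseline (50% accuracy)
model_revenue_musd = 2.94      # With Model (78% precision)

# Net gain is the difference between the two scenarios.
net_gain_musd = round(model_revenue_musd - baseline_revenue_musd, 2)

# Relative uplift over the baseline (derived here, not reported in the README).
relative_uplift_pct = round(100 * net_gain_musd / baseline_revenue_musd, 1)

print(f"Net gain: ${net_gain_musd:.2f}M")
print(f"Uplift over baseline: {relative_uplift_pct:.1f}%")
```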
- Data Extraction — Queried 160M+ POS records, filtered to 220K Black Friday transactions
- Feature Engineering — Joined SKU metadata; one-hot encoded categorical features (color, style, size)
- Class Imbalance — Applied SMOTE to address 96:4 purchase-to-return ratio
- Modeling — Compared Logistic Regression, Random Forest, and K-means + Logistic Regression ensemble
- Evaluation — Selected K-means + LR ensemble based on minority class recall and business ROI
```
dillards-return-prediction/
├── 01-data-cleaning/
├── 02-eda/
├── 03-feature-engineering/
├── 04-modeling/
├── 05-deliverables/
│   ├── final_report.pdf
│   ├── presentation_slides.pdf
│   └── roi_analysis.xlsx
└── requirements.txt
```
- Languages: Python, SQL
- ML: Scikit-learn, SMOTE (imbalanced-learn)
- Data: Pandas, NumPy
- Database: PostgreSQL
```bash
git clone https://github.com/nu-mlds-group/dillards-return-prediction.git
cd dillards-return-prediction
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
```

| Document | Description |
|---|---|
| Final Report | Technical analysis (5 pages) |
| Presentation | Executive summary slides |
| ROI Analysis | Business impact calculations |
The dataset belongs to Northwestern University's MLDS program and is not publicly available. Notebooks document the methodology but require database access to execute.