nu-mlds-group/dillards-return-prediction

Dillard's Black Friday Return Prediction

Predicts whether a Black Friday transaction will end as a purchase or a return, with the goal of reducing return-related costs and informing retail inventory decisions.

Problem

Black Friday generates massive transaction volumes, and with them a large number of product returns. Returns are costly for retailers: they add shipping and restocking expenses and forfeit the revenue from the original sale. This project uses machine learning to predict whether each transaction will end as a purchase (P) or a return (R), with the aim of enabling:

  • Reduced return-related costs
  • Optimized inventory stocking
  • Data-driven product recommendations

Results

Metric              Value
Purchase Precision  78%
Return Recall       58%
Accuracy            67.7%
Projected ROI       227% (~$590K)
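
For readers less familiar with the metrics above, the following minimal sketch shows how purchase precision, return recall, and accuracy can be computed from P/R labels. The function name `purchase_return_metrics` and the toy labels are hypothetical and for illustration only; they are not taken from the project notebooks or data.

```python
def purchase_return_metrics(y_true, y_pred):
    """Compute purchase precision, return recall, and accuracy for P/R labels."""
    pairs = list(zip(y_true, y_pred))
    predicted_p = sum(1 for _, p in pairs if p == "P")
    actual_r = sum(1 for t, _ in pairs if t == "R")
    correct_p = sum(1 for t, p in pairs if t == "P" and p == "P")
    correct_r = sum(1 for t, p in pairs if t == "R" and p == "R")
    return {
        # Of everything predicted as a purchase, how much really was one?
        "purchase_precision": correct_p / predicted_p if predicted_p else 0.0,
        # Of all actual returns, how many did the model catch?
        "return_recall": correct_r / actual_r if actual_r else 0.0,
        # Share of all transactions labeled correctly.
        "accuracy": (correct_p + correct_r) / len(pairs) if pairs else 0.0,
    }


# Toy labels for illustration only (not project data).
y_true = ["P", "P", "P", "R", "R", "P", "R", "P"]
y_pred = ["P", "R", "P", "R", "P", "P", "R", "P"]
metrics = purchase_return_metrics(y_true, y_pred)
print(metrics)
```

With a heavily imbalanced target, accuracy alone can look good while most returns are missed, which is one reason to report return recall alongside it.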

Revenue Impact

Scenario                    Revenue
Baseline (50% accuracy)     $1.89M
With Model (78% precision)  $2.94M
Net Gain                    $1.05M

ROI figures are projected estimates based on model performance assumptions.
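
As a quick check of the arithmetic in the Revenue Impact table, the short sketch below recomputes the net gain from the two scenario revenues and expresses it relative to the baseline. It uses only the figures stated above; the variable names are illustrative, and it does not attempt to reproduce the ROI calculation, whose cost assumptions are documented in the ROI analysis rather than here.

```python
# Scenario revenues as stated in the Revenue Impact table (USD).
baseline_revenue = 1.89e6
model_revenue = 2.94e6

# Net gain is the difference between the two scenarios.
net_gain = model_revenue - baseline_revenue
# Relative uplift expresses the gain as a fraction of the baseline.
relative_uplift = net_gain / baseline_revenue

print(f"Net gain: ${net_gain / 1e6:.2f}M")             # Net gain: $1.05M
print(f"Uplift over baseline: {relative_uplift:.1%}")  # Uplift over baseline: 55.6%
```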

Approach

  1. Data Extraction — Queried 160M+ POS records, filtered to 220K Black Friday transactions
  2. Feature Engineering — Joined SKU metadata; one-hot encoded categorical features (color, style, size)
  3. Class Imbalance — Applied SMOTE to address 96:4 purchase-to-return ratio
  4. Modeling — Compared Logistic Regression, Random Forest, and K-means + Logistic Regression ensemble
  5. Evaluation — Selected K-means + LR ensemble based on minority class recall and business ROI

Project Structure

dillards-return-prediction/
├── 01-data-cleaning/
├── 02-eda/
├── 03-feature-engineering/
├── 04-modeling/
├── 05-deliverables/
│   ├── final_report.pdf
│   ├── presentation_slides.pdf
│   └── roi_analysis.xlsx
└── requirements.txt

Tech Stack

  • Languages: Python, SQL
  • ML: Scikit-learn, SMOTE (imbalanced-learn)
  • Data: Pandas, NumPy
  • Database: PostgreSQL

Quick Start

git clone https://github.com/nu-mlds-group/dillards-return-prediction.git
cd dillards-return-prediction
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt

Deliverables

Document       Description
Final Report   Technical analysis (5 pages)
Presentation   Executive summary slides
ROI Analysis   Business impact calculations

Data Note

The dataset belongs to Northwestern University's MLDS program and is not publicly available. Notebooks document the methodology but require database access to execute.

License

MIT
