MSc Data Science & Computational Intelligence — Coventry University, UK · B.E. Computer Science
Three years in the UK — including a team lead role at NEC Birmingham — before returning to India to do analyst work full-time. I'm drawn to the question behind the question: not which customers are unhappy, but why the gap between expectation and delivery exists in the first place.
Diagnostic analysis of a synthetic ₹784 Cr NBFC retail portfolio across 9,800 loans. The counterintuitive finding: Good-credit salaried borrowers defaulted at the highest rate (14.01%), driven by approvals at a loan-to-income (LTI) multiple of 10.20x that the credit score was masking. Recovery collapses from 35% to 3.3% between DPD 30 and DPD 60 (days past due), which marks DPD 60 as the true escalation threshold, 30 days earlier than the industry norm.
| Metric | Value |
|---|---|
| Portfolio value | ₹784 Cr |
| Loans analysed | 9,800 |
| Collection gap identified | ₹50.2 Cr |
| Peak write-off rate | 14.01% (Good-credit segment) |
| Recovery cliff | DPD 60 → drops to 3.3% |
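
A recovery cliff like the one above shows up once recoveries are aggregated by DPD bucket. The snippet below is a minimal sketch of that aggregation, not the project's actual code: the bucket boundaries, the record layout, and the toy figures are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical toy records: (days past due, amount overdue, amount recovered).
# These figures are illustrative only and are not taken from the portfolio.
loans = [
    (15, 100.0, 60.0),
    (25, 200.0, 110.0),
    (45, 150.0, 30.0),
    (55, 250.0, 40.0),
    (75, 300.0, 12.0),
    (120, 400.0, 8.0),
]

# Assumed bucket boundaries: inclusive upper bound and label for each bucket.
BUCKETS = [(30, "DPD 1-30"), (60, "DPD 31-60"), (90, "DPD 61-90")]


def dpd_bucket(dpd: int) -> str:
    """Map days past due (assumed >= 1) to a bucket label."""
    for upper, label in BUCKETS:
        if dpd <= upper:
            return label
    return "DPD 90+"


def recovery_by_bucket(records):
    """Return {bucket: total recovered / total overdue} for each bucket."""
    overdue = defaultdict(float)
    recovered = defaultdict(float)
    for dpd, due, rec in records:
        bucket = dpd_bucket(dpd)
        overdue[bucket] += due
        recovered[bucket] += rec
    return {b: recovered[b] / overdue[b] for b in overdue}


rates = recovery_by_bucket(loans)
for bucket, rate in rates.items():
    print(f"{bucket}: {rate:.1%}")
```

Reading the rates bucket by bucket shows where the steepest drop sits, which is the point at which escalation should start.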
Segmented 93K+ Brazilian e-commerce customers using K-Means (k=4), validated against hierarchical clustering and DBSCAN. The key output: one segment's average review score of 1.6 traced directly to 19-day shipping delays. That's a logistics problem dressed up as a satisfaction problem.
| Metric | Value |
|---|---|
| Customers segmented | 93,000+ |
| Segments | 4 |
| Risk segment avg review | 1.6 ★ |
| Avg shipping delay (risk group) | 19 days |
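
Once every customer has a cluster label, linking a low review score to shipping delay comes down to profiling each segment side by side. The sketch below covers only that profiling step, assuming the labels already exist; the row layout and values are hypothetical, and the clustering itself is not reproduced here.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows: (segment label, review score 1-5, delivery delay in days).
# Illustrative values only, not the project's data.
orders = [
    (0, 5, 0), (0, 4, 1), (1, 4, 2), (1, 5, 0),
    (2, 1, 18), (2, 2, 21), (3, 3, 5), (3, 4, 3),
]


def profile_segments(rows):
    """Return {segment: (mean review score, mean delay in days)}."""
    scores = defaultdict(list)
    delays = defaultdict(list)
    for segment, score, delay in rows:
        scores[segment].append(score)
        delays[segment].append(delay)
    return {s: (mean(scores[s]), mean(delays[s])) for s in scores}


profiles = profile_segments(orders)
# The segment with the lowest mean review score is the one to inspect first.
worst = min(profiles, key=lambda s: profiles[s][0])
print(worst, profiles[worst])
```

Putting the two means next to each other is what lets a satisfaction metric be read against an operational one.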
Absenteeism prediction pipeline on 20K employee records — 85% classification accuracy across 8 models, R² of 0.51 on hours absent. The model correctly flagged 78% of employees who went on to take 20+ absence days. HR gets an early-warning list, not a post-hoc report.
| Metric | Value |
|---|---|
| Records | 20,000 |
| Classification accuracy | 85% |
| Regression R² | 0.51 |
| High-risk employees flagged | 78% |
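
The 78% figure reads as recall on the high-absence class: of the employees who actually took 20+ absence days, the share the model flagged in advance. That is a different number from overall accuracy. A minimal sketch with hypothetical labels (not the project's data) shows how the two diverge:

```python
def recall(actual, predicted):
    """Share of actual positives that were also predicted positive."""
    true_pos = sum(1 for a, p in zip(actual, predicted) if a and p)
    actual_pos = sum(1 for a in actual if a)
    if actual_pos == 0:
        raise ValueError("recall is undefined without any actual positives")
    return true_pos / actual_pos


# Hypothetical labels: 1 = employee took 20+ absence days, 0 = did not.
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

# Accuracy counts every correct call, positive or negative.
accuracy = sum(a == p for a, p in zip(actual, predicted)) / len(actual)
print(f"recall={recall(actual, predicted):.2f}, accuracy={accuracy:.2f}")
```

For an early-warning list, recall is the figure that matters most, because a missed high-risk employee costs more than a false alarm.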
Python · Pandas · NumPy · Scikit-learn · SQL · PostgreSQL · Power BI · DAX · Tableau · Excel · Git