Mohit-Singh-261097/README.md

Mohit Singh

Data Analyst · Nagpur, India

MSc Data Science & Computational Intelligence — Coventry University, UK · B.E. Computer Science

I spent three years in the UK, including a team lead role at NEC Birmingham, before returning to India to do analyst work full-time. I'm drawn to the question behind the question: not which customers are unhappy, but why the gap between expectation and delivery exists in the first place.


Projects

PostgreSQL · BigQuery · Power BI · DAX

Diagnostic analysis of a synthetic ₹784 Cr NBFC retail portfolio across 9,800 loans. The counterintuitive finding: Good-credit salaried borrowers defaulted at the highest rate (14.01%), driven by loans approved at a loan-to-income (LTI) ratio of 10.20x that the credit score was masking. Recovery collapses from 35% to 3.3% between DPD 30 and DPD 60 (days past due), which identifies DPD 60 as the true escalation threshold, 30 days earlier than the industry norm.

| Metric | Value |
| --- | --- |
| Portfolio value | ₹784 Cr |
| Loans analysed | 9,800 |
| Collection gap identified | ₹50.2 Cr |
| Peak write-off rate | 14.01% (Good-credit segment) |
| Recovery cliff | DPD 60 → drops to 3.3% |
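
The recovery cliff and the LTI figure above both come down to simple ratios. The following is a minimal sketch, not the project's actual SQL or DAX: the bucket boundaries, the field names (`dpd`, `overdue`, `recovered`) and the toy records are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical bucket boundaries (upper bound inclusive); the project's own cut-offs may differ.
BUCKETS = [(30, "DPD 1-30"), (60, "DPD 31-60"), (90, "DPD 61-90")]


def dpd_bucket(days_past_due):
    """Map a days-past-due count to a bucket label."""
    if days_past_due <= 0:
        return "Current"
    for upper, label in BUCKETS:
        if days_past_due <= upper:
            return label
    return "DPD 90+"


def recovery_rate_by_bucket(loans):
    """Return recovered amount / overdue amount for each DPD bucket."""
    overdue = defaultdict(float)
    recovered = defaultdict(float)
    for loan in loans:
        bucket = dpd_bucket(loan["dpd"])
        overdue[bucket] += loan["overdue"]
        recovered[bucket] += loan["recovered"]
    # Skip buckets with no overdue balance to avoid dividing by zero.
    return {b: recovered[b] / overdue[b] for b in overdue if overdue[b] > 0}


def lti_ratio(loan_amount, annual_income):
    """Loan-to-income: loan amount as a multiple of annual income."""
    return loan_amount / annual_income


# Toy records, invented for illustration only (not the portfolio data).
loans = [
    {"dpd": 15, "overdue": 1000.0, "recovered": 400.0},
    {"dpd": 25, "overdue": 1000.0, "recovered": 200.0},
    {"dpd": 45, "overdue": 2000.0, "recovered": 100.0},
    {"dpd": 75, "overdue": 1000.0, "recovered": 20.0},
]
rates = recovery_rate_by_bucket(loans)
for bucket, rate in sorted(rates.items()):
    print(f"{bucket}: {rate:.1%}")
```

Reading the per-bucket rates side by side is what makes a cliff visible: a sharp drop between two adjacent buckets marks the point where escalation should already have happened.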

Python · Scikit-learn · Power BI

Segmented 93K+ Brazilian e-commerce customers using K-Means (k=4), validated against hierarchical clustering and DBSCAN. The key output: one segment's average review score of 1.6 traced directly to shipping delays averaging 19 days. That's a logistics problem dressed up as a satisfaction problem.

| Metric | Value |
| --- | --- |
| Customers segmented | 93,000+ |
| Segments | 4 |
| Risk segment avg review | 1.6 ★ |
| Avg shipping delay (risk group) | 19 days |
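
Once each customer carries a cluster label, tracing a low review score back to shipping delays is a per-segment profiling step. The sketch below covers only that step and assumes the labels already exist; it does not reproduce the K-Means fit. The field names (`segment`, `review_score`, `delay_days`) and the rows are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def profile_segments(rows):
    """Average review score and shipping delay for each segment label."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["segment"]].append(row)
    return {
        seg: {
            "customers": len(members),
            "avg_review": mean(r["review_score"] for r in members),
            "avg_delay_days": mean(r["delay_days"] for r in members),
        }
        for seg, members in grouped.items()
    }


# Toy rows, invented for illustration; "segment" stands in for a cluster label.
rows = [
    {"segment": 0, "review_score": 5, "delay_days": 0},
    {"segment": 0, "review_score": 4, "delay_days": 2},
    {"segment": 3, "review_score": 1, "delay_days": 15},
    {"segment": 3, "review_score": 2, "delay_days": 21},
]
profile = profile_segments(rows)

# The segment with the lowest average review is the one to investigate first.
worst = min(profile, key=lambda seg: profile[seg]["avg_review"])
print(worst, profile[worst])
```

Putting the review average and the delay average in the same row per segment is what turns "unhappy customers" into a question about logistics.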

Python · Scikit-learn

Absenteeism prediction pipeline on 20K employee records: 85% classification accuracy across 8 models and an R² of 0.51 on hours absent. The model correctly flagged 78% of employees who went on to take 20+ absence days. HR gets an early-warning list, not a post-hoc report.

| Metric | Value |
| --- | --- |
| Records | 20,000 |
| Classification accuracy | 85% |
| Regression R² | 0.51 |
| High-risk employees flagged | 78% |
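
The "high-risk employees flagged" figure reads like recall on the high-absence class, which is a different measure from overall accuracy. The sketch below assumes that interpretation and a 20-day labelling threshold taken from the description above; the helper names, the absence counts and the predictions are invented for illustration and are not the project's pipeline.

```python
HIGH_RISK_DAYS = 20  # threshold taken from the "20+ absence days" description


def recall(y_true, y_pred, positive=1):
    """Share of actual positives that the model also predicted as positive."""
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    actual_pos = sum(1 for t in y_true if t == positive)
    if actual_pos == 0:
        raise ValueError("no positive cases in y_true")
    return true_pos / actual_pos


def accuracy(y_true, y_pred):
    """Share of all predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy data, invented for illustration only.
days_absent = [25, 3, 30, 8, 22, 1, 40, 5]
y_true = [int(d >= HIGH_RISK_DAYS) for d in days_absent]
y_pred = [1, 0, 1, 0, 0, 0, 1, 1]  # hypothetical model output

print(f"recall on high-risk class: {recall(y_true, y_pred):.2f}")
print(f"overall accuracy: {accuracy(y_true, y_pred):.2f}")
```

For an early-warning list, recall on the high-risk class is the number that matters most, because a missed high-risk employee costs more than a false alarm.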

Stack

Python · Pandas · NumPy · Scikit-learn · SQL · PostgreSQL · Power BI · DAX · Tableau · Excel · Git


Get in touch

[email protected] · LinkedIn

Pinned

  1. Customer-Segmentation-Olist (Public)

    Customer segmentation on 93K+ Brazilian e-commerce customers using K-Means, hierarchical, and DBSCAN clustering, identifying 4 actionable segments with targeted business strategies

    Jupyter Notebook

  2. employee-absenteeism-analysis (Public)

    End-to-end ML project predicting employee absenteeism — regression, classification, and churn prediction across 20K records using Random Forest, XGBoost, and LightGBM

    Jupyter Notebook

  3. Loan-Portfolio-Diagnostics- (Public)

    End-to-end credit risk & collections diagnostic on a synthetic NBFC loan portfolio — PostgreSQL · BigQuery (GCP) · Power BI · DAX | 9,800 loans · ₹784 Cr AUM · delinquency analysis · geographic con…

    Python