AI-powered credit risk evaluation that classifies loan applications into four risk tiers with 95%+ recall.
Model: Logistic Regression + Optuna | Metrics: Recall 95.31%, Precision 58.55% | Deployment: FastAPI + Streamlit + MLflow
This project uses explainable machine learning to predict default probability and assign CIBIL-like credit scores (300-900) for loan risk assessment, achieving financial industry benchmarks for recall and Gini coefficient.
CredVibe is a production-ready credit risk assessment system developed for an Indian NBFC. It evaluates loan applications using 14 key features across personal, bureau, and loan data sources, delivering instant credit scores and risk ratings (Poor/Average/Good/Excellent) with full explainability for business rule integration.
- 4-Tier Risk Classification: Categorizes applicants into Poor, Average, Good, and Excellent ratings based on CIBIL-like scoring (300-900).
- High Recall Optimization: Achieves 95.31% recall on default class, ensuring 19 out of 20 high-risk defaulters are flagged for review.
- Explainable AI: Logistic regression coefficients convertible to business rules for integration with Business Rule Engine (BRE).
- Dual Interface: Access via modern Streamlit UI or REST API through FastAPI backend.
- Real-time Scoring: Instant credit score calculation with default probability and LTI (Loan-to-Income) derivation.
- Advanced Feature Engineering: IV-based selection (IV > 0.02), VIF filtering (VIF < 5), and derived features (Delinquency Ratio, Avg DPD).
- Class Imbalance Handling: SMOTE and SMOTE + Tomek Links for balanced training.
- Hyperparameter Optimization: Optuna and Bayesian Optimization for model tuning.
- Financial Validation: Rank ordering, KS Statistics (>40), Gini (>85), and Decile Analysis for model robustness.
- Experiment Tracking: MLflow integration for model versioning and experiment management.
| Metric | Champion (Optuna) | Challenger (Bayesian) | Target | Status |
|---|---|---|---|---|
| Recall | 95.31% | 93.68% | > 90% | ✅ Met |
| Precision | 58.55% | 54.74% | > 50% | ✅ Met |
| Weighted F1 | 94.21% | 93.46% | - | - |
| Macro F1 | 84.21% | 82.67% | - | - |
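The table reports recall and precision but not the F1 score they imply for the default class. A minimal sketch of that arithmetic, assuming both figures refer to the default class (the function name is illustrative and not part of the project code):

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Champion (Optuna) figures from the table above, expressed as fractions.
champion_f1 = f1_from_precision_recall(precision=0.5855, recall=0.9531)
print(f"Default-class F1 (champion): {champion_f1:.4f}")
```

This is lower than the weighted and macro F1 rows because those average over both classes, and the non-default class is far larger.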
While ensemble methods (Random Forest, XGBoost) were evaluated, Logistic Regression was selected as the champion model due to:
- Explainability: Coefficients directly interpretable as log-odds, easily convertible to business rules for the NBFC's Business Rule Engine
- Regulatory Compliance: Transparent decision-making suitable for financial audit and regulatory requirements
- Production Stability: Consistent predictions across out-of-time validation (Mar–May 2025)
- Speed: Sub-millisecond inference time for real-time scoring
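To illustrate the explainability point, the sketch below converts log-odds coefficients into odds ratios, which is one common way to phrase a logistic regression as business rules. The coefficient values are hypothetical placeholders; the real ones live in the trained model and are not published here.

```python
import math

# Hypothetical coefficients for illustration only (not the trained model's values).
example_coefficients = {
    "credit_utilization_ratio": 0.8,
    "delinquency_ratio_in_pc": 1.2,
    "age": -0.3,
}


def odds_ratios(coefficients: dict[str, float]) -> dict[str, float]:
    """Convert log-odds coefficients to odds ratios via exp(coefficient)."""
    return {name: math.exp(beta) for name, beta in coefficients.items()}


for name, ratio in odds_ratios(example_coefficients).items():
    # A ratio above 1 increases the odds of default; below 1 decreases them.
    print(f"{name}: a one-unit increase multiplies the odds of default by {ratio:.2f}")
```

If the features are scaled before training, "one unit" refers to the scaled feature, so the ratio should be read against the scaler's range rather than the raw value.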
credit_risk_evaluation/
│
├── .gitignore # Git ignore file
├── LICENSE.md # License file
├── README.md # Project documentation
├── requirements.txt # Python dependencies
│
├── Artifacts/ # Screenshots and visualizations
│
├── Backend/ # FastAPI backend server
│ └── server.py # FastAPI entry point with endpoints
│
├── Core/ # Shared prediction logic
│ ├── __init__.py
│ └── prediction_helper.py # Model loading, preprocessing, scoring
│
└── Frontend/ # Streamlit frontend application
    └── app.py # Interactive UI with custom CSS
- Python 3.10 or higher
- Git
git clone https://github.com/inv-fourier-transform/credit_risk_evaluation.git
cd credit_risk_evaluation
python -m venv venv
# On Windows:
venv\Scripts\activate
# On macOS/Linux:
source venv/bin/activate
pip install -r requirements.txt
Place the trained model file model_data2.joblib inside:
Artifacts/
Navigate to the project root and start the server:
uvicorn Backend.server:app --reload --host 0.0.0.0 --port 8000
The API will be available at:
http://localhost:8000
- Open Postman and create a new request
- Set request type to POST and enter URL:
http://localhost:8000/predict_credit_risk
- Go to Body tab → Select raw → Set type to JSON
- Enter JSON payload:
{
"age": 35,
"income": 1200000,
"loan_amount": 500000,
"loan_tenure_months": 12,
"avg_dpd_per_delinquency": 2,
"delinquency_ratio_in_pc": 10,
"credit_utilization_ratio": 30,
"number_of_open_accounts": 2,
"residence_type": "Owned",
"loan_purpose": "Home",
"loan_type": "Unsecured"
}
- Click Send and receive JSON response:
{
"probability": "0.152340",
"credit_score": 763,
"rating": "Excellent"
}
Alternative test endpoint:
Send a GET request to:
http://localhost:8000/test
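The same request can be sent from Python instead of Postman. The sketch below uses only the standard library and assumes the server is running locally on port 8000 as described above; the helper names `build_request` and `score_applicant` are illustrative and not part of the project code.

```python
import json
import urllib.request

# Assumes the FastAPI server is running locally on port 8000.
API_URL = "http://localhost:8000/predict_credit_risk"

# Same sample applicant as in the Postman walkthrough.
payload = {
    "age": 35,
    "income": 1200000,
    "loan_amount": 500000,
    "loan_tenure_months": 12,
    "avg_dpd_per_delinquency": 2,
    "delinquency_ratio_in_pc": 10,
    "credit_utilization_ratio": 30,
    "number_of_open_accounts": 2,
    "residence_type": "Owned",
    "loan_purpose": "Home",
    "loan_type": "Unsecured",
}


def build_request(applicant: dict, url: str = API_URL) -> urllib.request.Request:
    """Build a POST request carrying the applicant as a JSON body."""
    body = json.dumps(applicant).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def score_applicant(applicant: dict, url: str = API_URL, timeout: float = 10.0) -> dict:
    """Send the applicant to the API and return the parsed JSON response."""
    request = build_request(applicant, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `score_applicant(payload)` should return a dictionary with the `probability`, `credit_score`, and `rating` keys shown in the response above.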
Open a new terminal, navigate to the project directory, and run:
streamlit run Frontend/app.py
Open your browser and go to:
http://localhost:8501
Fill in the 11 input fields, laid out across 4 rows (LTI is calculated automatically from Loan Amount and Income):
- Age
- Income
- Loan Amount
- LTI (auto-calculated)
- Loan Tenure
- DPD
- Delinquency Ratio
- Credit Utilization
- Open Accounts
- Residence Type
- Loan Purpose
- Loan Type
The system processes inputs and displays:
- Default Probability
  - Green: < 25%
  - Yellow: 25–50%
  - Red: > 50%
- Credit Score
  - 300–900 scale with position indicator
- Rating Badge
  - Poor / Average / Good / Excellent
  - Gradient styling
Expand "How to interpret these results" section for guidance.
Modify inputs and rerun for scenario analysis.
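The colour bands for Default Probability listed above can be expressed as a small lookup. This is a sketch of the thresholds only, not the code in Frontend/app.py; the function name is hypothetical, and treating exactly 25% and exactly 50% as Yellow is an assumption based on the "25–50%" range.

```python
def probability_band(default_probability: float) -> str:
    """Map a default probability (0 to 1) to the colour band shown in the UI."""
    if not 0.0 <= default_probability <= 1.0:
        raise ValueError("default_probability must be between 0 and 1")
    if default_probability < 0.25:
        return "green"
    # Assumption: both boundaries (25% and 50%) fall in the yellow band.
    if default_probability <= 0.50:
        return "yellow"
    return "red"


for p in (0.10, 0.30, 0.75):
    print(f"{p:.0%} -> {probability_band(p)}")
```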
| Endpoint | Method | Description |
|---|---|---|
| /test | GET | Health check endpoint |
| /predict_credit_risk | POST | Submit applicant data and receive credit assessment |
Content-Type: application/json
Body: JSON object with applicant features
{
"probability": "0.152340",
"credit_score": 763,
"rating": "Excellent"
}
- IV (Information Value) Screening
  Retained variables with IV > 0.02 (weak to very strong predictive power)
- VIF Analysis
  Dropped multicollinear features so that every retained variable has VIF < 5
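For reference, the IV screening step can be sketched as follows. It uses the standard Weight of Evidence definition and takes per-bin counts for one feature; the function name and the bin counts are made up for illustration, and the project's own binning strategy is not shown here.

```python
import math


def information_value(bins: list[tuple[int, int]]) -> float:
    """Compute IV from per-bin (non_default_count, default_count) pairs.

    IV = sum over bins of (good_share - bad_share) * ln(good_share / bad_share)
    """
    total_good = sum(good for good, _ in bins)
    total_bad = sum(bad for _, bad in bins)
    iv = 0.0
    for good, bad in bins:
        if good == 0 or bad == 0:
            # WoE is undefined for an empty cell; skipping the bin is one
            # simple convention (smoothing the counts is another).
            continue
        good_share = good / total_good
        bad_share = bad / total_bad
        iv += (good_share - bad_share) * math.log(good_share / bad_share)
    return iv


# Made-up bin counts for a single feature, purely for illustration.
example_bins = [(400, 10), (300, 30), (200, 60)]
iv = information_value(example_bins)
print(f"IV = {iv:.3f}, retained: {iv > 0.02}")
```

A feature whose bins all have the same default rate yields an IV of zero and would be dropped by the IV > 0.02 rule.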
- LTI = Loan Amount / Income
- Delinquency Ratio = Delinquent Months / Total Loans
- Avg DPD per Delinquency
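A minimal sketch of how these derived features might be computed for one applicant. The raw input names (`delinquent_months`, `total_loans`, `total_dpd`) are hypothetical, the delinquency ratio is expressed as a percentage to match the `delinquency_ratio_in_pc` field of the API payload, and the average DPD formula is an assumption since it is not spelled out above.

```python
def derive_features(
    loan_amount: float,
    income: float,
    delinquent_months: int,
    total_loans: int,
    total_dpd: int,
) -> dict[str, float]:
    """Derive LTI, delinquency ratio (in percent), and average DPD."""
    # LTI = Loan Amount / Income
    loan_to_income = loan_amount / income if income > 0 else 0.0
    # Delinquency Ratio = Delinquent Months / Total Loans, as a percentage
    delinquency_ratio_in_pc = (
        100 * delinquent_months / total_loans if total_loans > 0 else 0.0
    )
    # Assumption: average DPD is total days past due divided by the number
    # of delinquent months.
    avg_dpd_per_delinquency = (
        total_dpd / delinquent_months if delinquent_months > 0 else 0.0
    )
    return {
        "loan_to_income": round(loan_to_income, 2),
        "delinquency_ratio_in_pc": round(delinquency_ratio_in_pc, 1),
        "avg_dpd_per_delinquency": round(avg_dpd_per_delinquency, 1),
    }


features = derive_features(
    loan_amount=500000,
    income=1200000,
    delinquent_months=1,
    total_loans=10,
    total_dpd=2,
)
print(features)
```

The zero guards avoid division errors for applicants with no income on record or no delinquency history.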
- Algorithm: Logistic Regression with L2 regularization
- Optimization: Optuna (Champion) vs Bayesian Optimization (Challenger)
- Resampling: SMOTE and SMOTE + Tomek Links for class imbalance
- Validation: Out-of-time validation (Mar–May 2025) to ensure temporal stability
- Rank Ordering: Borrowers sorted by predicted probability and grouped into deciles
- KS Statistic: Maximum separation between the cumulative distributions of defaulters and non-defaulters (target: KS > 40, with the maximum falling within the top 3 deciles)
- Gini Coefficient: Model discrimination power (>85 considered excellent)
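The validation metrics above can be sketched in plain Python. This is an illustrative implementation on a made-up ten-row sample, not the project's validation code: KS is taken as the largest gap between cumulative defaulter and non-defaulter shares across deciles, and Gini is computed as 2 × AUC − 1, both on a 0–100 scale. Tied scores at decile boundaries are not treated specially.

```python
def decile_ks(y_true: list[int], y_prob: list[float], n_bins: int = 10) -> float:
    """KS (0-100) from a decile table, riskiest borrowers first."""
    ranked = sorted(zip(y_prob, y_true), key=lambda pair: pair[0], reverse=True)
    n = len(ranked)
    total_bad = sum(y_true)
    total_good = n - total_bad
    cum_bad = cum_good = 0
    ks = 0.0
    for i in range(n_bins):
        chunk = ranked[i * n // n_bins:(i + 1) * n // n_bins]
        bads = sum(label for _, label in chunk)
        cum_bad += bads
        cum_good += len(chunk) - bads
        ks = max(ks, abs(cum_bad / total_bad - cum_good / total_good))
    return 100 * ks


def gini(y_true: list[int], y_prob: list[float]) -> float:
    """Gini (0-100) as 2 * AUC - 1, with AUC from pairwise comparisons."""
    bad_scores = [p for p, y in zip(y_prob, y_true) if y == 1]
    good_scores = [p for p, y in zip(y_prob, y_true) if y == 0]
    wins = 0.0
    for bad in bad_scores:
        for good in good_scores:
            if bad > good:
                wins += 1.0
            elif bad == good:
                wins += 0.5  # ties count as half a win
    auc = wins / (len(bad_scores) * len(good_scores))
    return 100 * (2 * auc - 1)


# Made-up labels (1 = default) and predicted probabilities for illustration.
labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
probs = [0.92, 0.85, 0.70, 0.64, 0.40, 0.33, 0.21, 0.15, 0.08, 0.03]
print(f"KS = {decile_ks(labels, probs):.1f}, Gini = {gini(labels, probs):.1f}")
```

The pairwise AUC is quadratic in the sample size, which is fine for a sketch but too slow for a full portfolio; a rank-based AUC would be the usual choice there.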
- Base Score: 300
- Scale: 600 points (range: 300–900)
- Formula:
Credit Score = 300 + (1 - Default Probability) × 600
- Poor: 300–499
- Average: 500–649
- Good: 650–749
- Excellent: 750–900
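The scoring formula and rating tiers above translate directly into code. The sketch below is illustrative rather than the logic in Core/prediction_helper.py; the function names are hypothetical, and rounding to the nearest integer is an assumption, since the formula does not say whether the score is rounded or truncated.

```python
BASE_SCORE = 300
SCALE_LENGTH = 600


def credit_score(default_probability: float) -> int:
    """Credit Score = 300 + (1 - Default Probability) x 600."""
    if not 0.0 <= default_probability <= 1.0:
        raise ValueError("default_probability must be between 0 and 1")
    # Assumption: round to the nearest integer.
    return round(BASE_SCORE + (1 - default_probability) * SCALE_LENGTH)


def rating(score: int) -> str:
    """Map a 300-900 score to its rating tier."""
    if score < 500:
        return "Poor"
    if score < 650:
        return "Average"
    if score < 750:
        return "Good"
    return "Excellent"


for p in (0.05, 0.30, 0.50, 0.80):
    score = credit_score(p)
    print(f"default probability {p:.2f} -> score {score} ({rating(score)})")
```

Because the mapping is linear, the tier boundaries correspond to fixed default probabilities: a score of 750 or above requires a default probability of at most 25%.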
- Source: 2 years of loan data from Indian NBFC (Feb 2023 – Feb 2025)
- Train/Validation: Feb 2023 – Feb 2025
- Out-of-Time Test: Mar 2025 – May 2025 (temporal holdout)
- Features: 14 (11 numeric + 3 categorical)
- Data Sources: Personal data, Bureau data, Loan data
Note: Dataset not included in repository due to client confidentiality.
The project uses MLflow for experiment tracking and model management:
- Experiment Logging: Tracks hyperparameters, metrics (Recall, Precision, F1, KS, Gini), and artifacts
- Model Registry: Versions champion and challenger models with stage transitions
- Comparison: Side-by-side metric comparison across Optuna, Bayesian, RF, and XGBoost experiments
To retrain the model with your own data:
- Prepare dataset with the 14 features listed above and target variable default_flag
- Apply IV screening (IV > 0.02) and VIF filtering (VIF < 5)
- Handle class imbalance with SMOTE
- Run Optuna hyperparameter search on Logistic Regression
- Validate on out-of-time data and check KS/Gini benchmarks
- Export model to model_data2.joblib with scaler and feature metadata
- Copy model to Artifacts/ directory
Contributions are welcome! Please open an issue or submit a pull request for improvements, bug fixes, or feature enhancements.
This project is licensed under the MIT License.
Because trusting your gut with someone else's money is how banks become cautionary tales.










