This project uses machine learning to predict whether a loan application should be approved. It aims to help banks and financial institutions make data-driven, risk-aware decisions by analyzing past applicant and loan data.
The goal is to develop and compare multiple ML models (Logistic Regression, SVM, Decision Tree) that predict the approval status of loan applications from applicant information such as income, credit history, and business value.
- Logistic Regression: 79.03% accuracy (best)
- Support Vector Machine (SVM): 69.35% accuracy
- Decision Tree: 66.13% accuracy
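
A comparison of the three models listed above could be set up roughly as follows. This is a minimal sketch, not the project's notebook: it uses synthetic data from `make_classification` as a stand-in for the loan dataset (an assumption), default hyperparameters, and a single 80/20 split, so it will not reproduce the accuracies reported above.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the loan data (assumption): 1500 rows, 18 numeric features.
X, y = make_classification(
    n_samples=1500, n_features=18, n_informative=6, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Scale features for the two models that are sensitive to feature magnitude.
models = {
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
}

# Fit each model on the training split and record held-out accuracy.
test_accuracy = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    test_accuracy[name] = model.score(X_test, y_test)
    print(f"{name}: {test_accuracy[name]:.2%}")
```

Keeping the scaler inside each pipeline means it is fitted on the training split only, which avoids leaking test-set statistics into the model.
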
- Data Collection: 1500 cases with 10 numerical + 8 categorical features
- Preprocessing:
  - Handling missing values
  - Encoding categorical variables (e.g., one-hot encoding)
  - Feature scaling and selection
- Model Training: Train/test split + cross-validation
- Evaluation:
  - Accuracy
  - Precision
  - Confusion matrix
  - Cross-validation scores
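
The steps above can be sketched as a single scikit-learn pipeline. This is an illustrative sketch under stated assumptions: the toy data, the column names (`income`, `loan_amount`, `education`, `property_area`), the imputation strategies, and the 5-fold setting are all hypothetical, and feature selection is omitted for brevity.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix, precision_score
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

rng = np.random.default_rng(0)
n = 200
numeric_cols = ["income", "loan_amount"]            # hypothetical column names
categorical_cols = ["education", "property_area"]   # hypothetical column names

# Toy data standing in for the real loan records (assumption).
df = pd.DataFrame({
    "income": rng.normal(5000, 1500, n),
    "loan_amount": rng.normal(150, 40, n),
    "education": rng.choice(["Graduate", "Not Graduate"], n),
    "property_area": rng.choice(["Urban", "Semiurban", "Rural"], n),
})
df[categorical_cols] = df[categorical_cols].astype(object)
y = (df["income"] + rng.normal(0, 1000, n) > 5000).astype(int)

# Inject some missing values so the imputers have work to do.
df.loc[rng.choice(n, 10, replace=False), "income"] = np.nan
df.loc[rng.choice(n, 10, replace=False), "education"] = np.nan

# Numeric: median imputation + scaling. Categorical: mode imputation + one-hot.
preprocess = ColumnTransformer([
    ("num", Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ]), numeric_cols),
    ("cat", Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),
        ("onehot", OneHotEncoder(handle_unknown="ignore")),
    ]), categorical_cols),
])
clf = Pipeline([
    ("preprocess", preprocess),
    ("model", LogisticRegression(max_iter=1000)),
])

X_train, X_test, y_train, y_test = train_test_split(
    df, y, test_size=0.2, stratify=y, random_state=0
)

# Cross-validate on the training split, then evaluate once on the held-out split.
cv_scores = cross_val_score(clf, X_train, y_train, cv=5, scoring="accuracy")
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)
precision = precision_score(y_test, y_pred, zero_division=0)
cm = confusion_matrix(y_test, y_pred)

print("CV accuracy per fold:", cv_scores.round(3))
print(f"Test accuracy: {accuracy:.3f}, precision: {precision:.3f}")
print("Confusion matrix:\n", cm)
```

Wrapping preprocessing and the model in one `Pipeline` lets `cross_val_score` refit the imputers, scaler, and encoder inside every fold, so the cross-validation scores are not inflated by information from the validation folds.
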
- Logistic Regression performed best (79.03% accuracy) and is also the simplest of the three models.
- SVM can model nonlinear relationships but underperformed Logistic Regression (69.35% vs. 79.03%).
- Decision Tree was the most interpretable model but had the lowest accuracy (66.13%).
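
To illustrate why a decision tree is easy to interpret, the sketch below fits a shallow tree and prints its learned rules with `export_text`. The data is synthetic and the feature names are hypothetical placeholders, so the printed thresholds carry no meaning for real loan decisions.

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier, export_text

# Synthetic data (assumption); the feature names below are hypothetical labels.
X, y = make_classification(
    n_samples=300, n_features=4, n_informative=3, n_redundant=0, random_state=0
)
feature_names = ["income", "credit_history", "loan_amount", "business_value"]

# A shallow tree keeps the rule set small enough to read at a glance.
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# Render the fitted tree as nested if/else rules.
rules = export_text(tree, feature_names=feature_names)
print(rules)
```

Each printed branch is a threshold test on one feature, which is what makes the model's decisions traceable in a way that logistic regression coefficients or SVM decision boundaries are not.
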
- Python (Jupyter Notebook)
- Libraries: pandas, scikit-learn (`sklearn`), matplotlib, seaborn
- Algorithms: Logistic Regression, SVM, Decision Tree