This is a good example of modeling imbalanced data in a classification problem.
Problem Statement -
This is an example of a simplified classification model. The objective of this exercise is to predict whether car loans can be approved for a set of applicants. The training data is provided with a TARGET variable of 1 or 0. We have to run the model and check its performance on the TEST data.
Contents -

Exploratory Data Analysis
- Looking at Data
- Plot and visualize Categorical Variables
- Correlation and plot of Numerical Variables
- Missing Value Stats

Phase 1 Modeling
- Simple way of Missing Value Imputation
- Model Preparation
- Random Forest - grid search, hyperparameter tuning and evaluation of best model by cross validation
- XGBoost - hyperparameter tuning, random search and evaluation of best model by cross validation
- Conclusion from Phase 1

Phase 2 Modeling
- Missing Value Imputation by KNN
- Random Forest
- XGBoost
- Is there any improvement?
- Implementation on Test
- Conclusion and Future Work
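
Since the problem statement centers on an imbalanced 1/0 TARGET, a useful first check before any of the steps listed above is to measure how skewed the classes are. The following is a minimal sketch, not the notebook's own code: the helper name `class_shares` and the toy label list are assumptions made purely for illustration and do not reflect the actual loan data.

```python
from collections import Counter


def class_shares(labels):
    """Return the fraction of rows belonging to each class label."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {cls: n / total for cls, n in counts.items()}


# Toy labels for illustration only (hypothetical, not the loan data):
# nine rows of class 0 and one row of class 1.
toy_target = [0] * 9 + [1]

shares = class_shares(toy_target)
print(shares)
```

With a pandas DataFrame, the same figures can be obtained with `df["TARGET"].value_counts(normalize=True)`. A very small share for one class suggests that plain accuracy will be a misleading metric and that stratified cross validation is worth considering.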