This project builds a machine learning model to predict whether a loan applicant will be approved. It covers data preprocessing, exploratory data analysis (EDA), model training, and deployment using Docker.
The CSV file contains 58,645 records with 13 columns:
- id
- person_age
- person_income
- person_home_ownership
- person_emp_length
- loan_intent
- loan_grade
- loan_amnt
- loan_int_rate
- loan_percent_income
- cb_person_default_on_file
- cb_person_cred_hist_length
- loan_status
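The column list above can be checked against the CSV header before training. The following is a minimal sketch using only the standard library; the function name `check_columns` and the file path passed to it are hypothetical, since this README does not name the CSV file.

```python
import csv

# The 13 columns listed above.
EXPECTED_COLUMNS = [
    "id",
    "person_age",
    "person_income",
    "person_home_ownership",
    "person_emp_length",
    "loan_intent",
    "loan_grade",
    "loan_amnt",
    "loan_int_rate",
    "loan_percent_income",
    "cb_person_default_on_file",
    "cb_person_cred_hist_length",
    "loan_status",
]


def check_columns(csv_path):
    """Return (missing, unexpected) column names found in the CSV header."""
    with open(csv_path, newline="") as f:
        # Read only the first row; an empty file yields an empty header.
        header = next(csv.reader(f), [])
    missing = [c for c in EXPECTED_COLUMNS if c not in header]
    unexpected = [c for c in header if c not in EXPECTED_COLUMNS]
    return missing, unexpected
```

Both returned lists are empty when the header matches the list exactly.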
The project's EDA and model training are documented in the notebook.
The model is trained by running this script. The training outputs are saved in xgboost_model.pkl and dv.pkl.
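The two saved files can be loaded back for inference. The sketch below assumes both files were written with Python's standard `pickle` module, which the `.pkl` extension suggests but this README does not confirm; the helper name `load_artifacts` is hypothetical. Unpickling also requires the libraries that created the objects to be installed, and should only be done with files you trust.

```python
import pickle


def load_artifacts(model_path="xgboost_model.pkl", dv_path="dv.pkl"):
    """Load the two training outputs and return them as (model, dv).

    Assumption: both files are plain pickle files.
    """
    with open(model_path, "rb") as f:
        model = pickle.load(f)
    with open(dv_path, "rb") as f:
        dv = pickle.load(f)
    return model, dv
```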
The notebook was developed using Python 3.9.20 with Miniconda. All dependencies are exported to the environment.yml file.
The training script can be run with Python and venv:
python -m venv my-venv
source my-venv/bin/activate
pip install --no-cache-dir -r requirements.txt
python train.py
The model is served with Flask in app.py and containerized with Docker.
Build the Docker image:
docker build -t loan-approval-proj .
Run the image:
docker run --rm -p 5000:5000 loan-approval-proj
The model's predictions can be tested directly with predict.py, or by calling the containerized service with:
curl -i -X POST http://localhost:5000/predict -H 'Content-Type: application/json' -d '{"loan_grade": "C", "person_home_ownership": "RENT", "loan_percent_income": 0.07, "loan_intent": "MEDICAL", "person_income": 30000, "person_emp_length": 0.0}'
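The same request can be sent from Python. This is a minimal sketch using only the standard library; the helper names `build_predict_request` and `predict` are hypothetical, and parsing the response as JSON is an assumption, since this README does not describe the response format.

```python
import json
import urllib.request

# Endpoint taken from the curl command above.
PREDICT_URL = "http://localhost:5000/predict"

# Same applicant payload as in the curl command above.
applicant = {
    "loan_grade": "C",
    "person_home_ownership": "RENT",
    "loan_percent_income": 0.07,
    "loan_intent": "MEDICAL",
    "person_income": 30000,
    "person_emp_length": 0.0,
}


def build_predict_request(payload, url=PREDICT_URL):
    """Build a JSON POST request for the prediction service."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(payload, url=PREDICT_URL, timeout=10):
    """Send the request and return the decoded response.

    Assumption: the service replies with a JSON body.
    """
    request = build_predict_request(payload, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the container running, `predict(applicant)` sends the request and returns the parsed response.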
