Diabetes Onset Prediction — Logistic Regression + Hyperparameter Tuning

Problem. Predict diabetes onset from clinical features to support early intervention.

Data. ./diabetes_clean.csv (columns include: pregnancies, glucose, diastolic, triceps, insulin, bmi, dpf, age, and the target column, diabetes).

Approach.

Baseline Logistic Regression; ROC curve + confusion matrix.
Compared against KNN (logistic performed better across metrics).
Hyperparameter tuning with GridSearchCV and RandomizedSearchCV.
Evaluated with train/test split; tracked accuracy, precision, recall, F1, ROC-AUC.
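
The steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: synthetic data from make_classification stands in for diabetes_clean.csv so the snippet runs on its own, and the scaling step, the C grids, the 70/30 split, and the roc_auc scoring choice are all assumptions.

```python
# Sketch only: synthetic data replaces diabetes_clean.csv, and the grids,
# split ratio, and scoring metric are assumptions, not the notebook's values.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score, roc_auc_score)
from sklearn.model_selection import (GridSearchCV, RandomizedSearchCV,
                                     train_test_split)
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def evaluate(model, X_test, y_test):
    """Return the tracked metrics for a fitted binary classifier."""
    y_pred = model.predict(X_test)
    y_prob = model.predict_proba(X_test)[:, 1]  # probability of class 1
    return {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred, zero_division=0),
        "recall": recall_score(y_test, y_pred, zero_division=0),
        "f1": f1_score(y_test, y_pred, zero_division=0),
        "roc_auc": roc_auc_score(y_test, y_prob),
        "confusion_matrix": confusion_matrix(y_test, y_pred),
    }


# With the real file this would be something like (assuming "diabetes" is
# the target column):
#   df = pd.read_csv("diabetes_clean.csv")
#   X, y = df.drop(columns="diabetes"), df["diabetes"]
X, y = make_classification(n_samples=400, n_features=8, n_informative=4,
                           n_clusters_per_class=1, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42)

# Baseline: scaled features + default-strength logistic regression.
baseline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
baseline.fit(X_train, y_train)
baseline_scores = evaluate(baseline, X_test, y_test)

# Exhaustive search over a small grid of regularization strengths.
C_grid = [0.01, 0.1, 1.0, 10.0, 100.0]
grid = GridSearchCV(
    make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    param_grid={"logisticregression__C": C_grid},
    cv=5, scoring="roc_auc")
grid.fit(X_train, y_train)

# Randomized search: sample 10 of 25 log-spaced candidates for C.
C_candidates = [10 ** (k / 4) for k in range(-12, 13)]
rand = RandomizedSearchCV(
    make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    param_distributions={"logisticregression__C": C_candidates},
    n_iter=10, cv=5, scoring="roc_auc", random_state=42)
rand.fit(X_train, y_train)

# Both searches refit the best configuration on the full training set.
tuned_scores = evaluate(grid.best_estimator_, X_test, y_test)
print("baseline:", baseline_scores["roc_auc"], "tuned:", tuned_scores["roc_auc"])
```

The test split is touched only by evaluate, so the cross-validated search never sees the data used for the final metrics.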

Results.

Logistic > KNN on all reported metrics.
ROC-AUC ≈ 0.801; accuracy ≈ 0.68; balanced performance across classes.

What I Learned.

Interpreting coefficients and thresholds via ROC.
Why CV-based tuning tends to generalize better than tuning against a single train/test split.
How metric choice (F1 vs AUC) shifts model selection.
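
To make the threshold and metric points above concrete, here is a small standard-library sketch. The labels and scores are invented toy values, not outputs from the notebook, and the helper names f1_at and roc_auc are hypothetical. It shows that ROC-AUC is computed from the ranking of scores alone, while F1 changes with the decision threshold.

```python
# Toy illustration: y_true and scores are made-up values, not notebook output.
y_true = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]
scores = [0.05, 0.20, 0.30, 0.35, 0.45, 0.55, 0.60, 0.70, 0.80, 0.95]


def f1_at(y_true, scores, threshold):
    """F1 for the positive class when predicting 1 if score >= threshold."""
    tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
    fn = sum(1 for y, s in zip(y_true, scores) if s < threshold and y == 1)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def roc_auc(y_true, scores):
    """Probability that a random positive outranks a random negative.

    Ties count as half a win; this equals the area under the ROC curve.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# AUC has no threshold argument: it depends only on how the scores rank.
auc = roc_auc(y_true, scores)

# F1 depends on where the cut is placed, so sweep every observed score.
candidates = sorted(set(scores))
best_threshold = max(candidates, key=lambda t: f1_at(y_true, scores, t))

print(f"AUC = {auc:.2f}")
print(f"F1 at 0.50 = {f1_at(y_true, scores, 0.5):.3f}")
print(f"best threshold = {best_threshold}, "
      f"F1 = {f1_at(y_true, scores, best_threshold):.3f}")
```

On this toy data the default 0.5 cut is not the F1-optimal one, which is one reason two models ranked by AUC can swap places when ranked by F1 at a fixed threshold.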

Quick Start

# clone if standalone
git clone https://github.com/Joe-Naz01/fine_tuning_supervised.git
cd fine_tuning_supervised

python -m venv .venv
source .venv/bin/activate   # Windows: .venv\Scripts\activate
pip install -r requirements.txt
jupyter notebook

