This project implements an end-to-end machine learning pipeline to detect abnormal breathing patterns during sleep — specifically identifying Hypopnea and Obstructive Apnea — from 8-hour physiological signals recorded for 5 participants.
The pipeline includes:
- Signal visualization with event overlays
- Noise filtering and preprocessing
- 30-second window dataset creation with 50% overlap
- CNN & Conv-LSTM deep learning models
- Leave-One-Participant-Out (LOPO) evaluation
- Automated performance reporting
Health Sensing/
│
├── Data/                         # Raw signals & annotations
│   ├── AP01/ ... AP05/
│
├── Dataset/
│   └── breathing_dataset.csv     # Final processed labeled dataset
│
├── Visualizations/               # Output plots/metrics/report visuals
│
├── models/
│   ├── cnn_model.py              # 1D CNN architecture
│   └── conv_lstm_model.py        # 1D Conv-LSTM architecture
│
├── scripts/
│   ├── vis.py                    # Signal visualization
│   ├── create_dataset.py         # Sliding-window dataset generation
│   └── train_model.py            # Model training + LOPO evaluation
│
├── requirements.txt              # Dependencies
├── README.md                     # Documentation (this file)
└── report.pdf                    # Final evaluation report with plots
Install Python dependencies:
pip install -r requirements.txt
Recommended: Python 3.8 – 3.12
(3.13 may need extra TensorFlow configuration)
🚀 Full Workflow
1️⃣ Visualization of Physiological Signals
Plots include:
Nasal airflow + thoracic movement (band-pass filtered at 0.1–0.5 Hz)
SpO₂ trace
Event rectangles: Hypopnea / OSA
python scripts/vis.py -name "Data/AP01"
Repeat for AP02–AP05 → PDFs saved to: Visualizations/
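The 0.1–0.5 Hz band-pass applied to the airflow and thoracic channels could look roughly like the sketch below. It is not the code in `scripts/vis.py`: the Butterworth design, the filter order, the 32 Hz sampling rate, and the function name `bandpass_breathing` are assumptions chosen for illustration.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt


def bandpass_breathing(signal, fs, low_hz=0.1, high_hz=0.5, order=4):
    """Zero-phase Butterworth band-pass that keeps the breathing band.

    Hypothetical helper: the filter type and order are assumptions.
    """
    # Passing fs lets the cutoffs be given directly in Hz.
    sos = butter(order, [low_hz, high_hz], btype="bandpass", fs=fs, output="sos")
    # Forward-backward filtering avoids phase shift, so events stay aligned in time.
    return sosfiltfilt(sos, signal)


# Synthetic check: a 0.25 Hz "breathing" wave plus 5 Hz interference.
fs = 32.0  # assumed sampling rate, not taken from the recordings
t = np.arange(0, 120, 1 / fs)
breathing = np.sin(2 * np.pi * 0.25 * t)
noisy = breathing + 0.5 * np.sin(2 * np.pi * 5.0 * t)
filtered = bandpass_breathing(noisy, fs)
```

Second-order sections (`output="sos"`) are used here because very low cutoffs relative to the sampling rate can be numerically fragile in transfer-function form.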
2️⃣ Dataset Creation
python scripts/create_dataset.py -in_dir "Data" -out_dir "Dataset"
Processing performed:
| Operation | Details |
|---|---|
| Resampling | All channels to 4 Hz |
| Windowing | 30 s windows with 50% overlap (120 timesteps) |
| Label assignment | ≥50% overlap → Hypopnea or OSA; missing labels → Normal |
Output: Dataset/breathing_dataset.csv
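The windowing and labeling steps can be sketched as follows. This is an illustration, not the code in `scripts/create_dataset.py`. Two things are assumed: events arrive as `(start_s, end_s, label)` tuples, and "≥50% overlap" means the event covers at least half of the 30 s window. The function name `make_windows` is hypothetical.

```python
import numpy as np

FS = 4              # Hz, sampling rate after resampling
WIN_S = 30          # window length in seconds
WIN = FS * WIN_S    # 120 timesteps per window
STEP = WIN // 2     # 50% overlap -> advance 60 timesteps (15 s)


def make_windows(signal, events):
    """Slice a 4 Hz signal into overlapping windows and label each one.

    events: list of (start_s, end_s, label) tuples (assumed format).
    """
    windows, labels = [], []
    for start in range(0, len(signal) - WIN + 1, STEP):
        w_start_s = start / FS
        w_end_s = w_start_s + WIN_S
        label = "Normal"  # default when no event covers enough of the window
        for ev_start, ev_end, ev_label in events:
            overlap_s = min(w_end_s, ev_end) - max(w_start_s, ev_start)
            # Assumption: overlap is measured as a fraction of the window.
            if overlap_s >= 0.5 * WIN_S:
                label = ev_label
                break
        windows.append(signal[start:start + WIN])
        labels.append(label)
    return np.stack(windows), labels


# Five minutes of placeholder signal with one 20 s event.
signal = np.zeros(5 * 60 * FS)
events = [(60.0, 80.0, "Hypopnea")]
X, y = make_windows(signal, events)
```

If the pipeline instead measures overlap as a fraction of the event duration, only the threshold line changes.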
3️⃣ Model Training + Evaluation
Models:
📌 1D CNN
📌 1D Conv-LSTM
Evaluation:
✔ Leave-One-Participant-Out CV
✔ No subject leakage (critical for biosignals)
Run:
python scripts/train_model.py
This script generates:
Confusion matrices per fold
Class distribution
Accuracy comparison (boxplots)
Precision, Recall (Sensitivity), and Specificity
Automated structured report.pdf
📊 Output Files
| Folder | Contents |
|---|---|
| Visualizations/ | Participant PDFs + confusion matrices + performance plots |
| Dataset/ | Labeled dataset for ML |
| Project root | report.pdf (🏆 submission-ready) |
Example artifacts:
AP01_visualization.pdf
cnn_confusion_PID_AP02.png
convlstm_confusion_PID_AP05.png
accuracy_boxplot.png
class_distribution.png
🧠 Models Overview
| Model | Strength | Use |
|---|---|---|
| 1D CNN | Fast, low-compute | Feature extraction from respiratory waveforms |
| Conv-LSTM | Temporal learning capability | Captures breathing sequence dynamics |
Both output:
Normal
Hypopnea
Obstructive Apnea
(3-class softmax classification)
🧪 Evaluation Metrics
Reported per class and per fold:
Accuracy
Precision
Recall / Sensitivity
Specificity
Confusion Matrix
Final score reported as Mean ± Standard Deviation across 5 folds 👍
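For a 3-class problem, the per-class metrics listed above can be derived from a confusion matrix in a one-vs-rest fashion. The sketch below is one way to do it, not the code in `scripts/train_model.py`. The function name `per_class_metrics`, the example matrix, and the fold accuracies are made-up illustration values, and the use of the sample standard deviation (`ddof=1`) is an assumption.

```python
import numpy as np


def per_class_metrics(cm):
    """cm[i, j] = number of windows with true class i predicted as class j."""
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)
    fn = cm.sum(axis=1) - tp          # missed windows of each class
    fp = cm.sum(axis=0) - tp          # windows wrongly assigned to each class
    tn = cm.sum() - tp - fn - fp      # everything else
    # A class that is never predicted yields NaN precision rather than an error.
    with np.errstate(divide="ignore", invalid="ignore"):
        return {
            "precision": tp / (tp + fp),
            "recall": tp / (tp + fn),         # same quantity as sensitivity
            "specificity": tn / (tn + fp),
            "accuracy": tp.sum() / cm.sum(),  # overall, not per class
        }


# Illustrative confusion matrix (rows/cols: Normal, Hypopnea, Obstructive Apnea).
cm = [[90, 5, 5],
      [10, 30, 10],
      [5, 5, 40]]
m = per_class_metrics(cm)

# Mean ± standard deviation across folds, using placeholder accuracies.
fold_acc = np.array([0.80, 0.75, 0.85, 0.70, 0.90])
mean_acc = fold_acc.mean()
std_acc = fold_acc.std(ddof=1)
print(f"accuracy: {mean_acc:.3f} ± {std_acc:.3f}")
```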
🔄 Why Leave-One-Participant-Out?
Biosignals are highly person-dependent.
Random train/test splits cause:
❌ Identity leakage
❌ Inflated performance
❌ Reduced generalization
LOPO ensures:
✔ Models learn population-level features
✔ Fair test on unseen participants
✔ Clinically valid evaluation
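The splitting logic behind LOPO is small enough to show directly. The sketch below uses plain Python and a hypothetical helper `lopo_splits`. It is not taken from `scripts/train_model.py`, and the participant list is a toy example.

```python
def lopo_splits(participant_ids):
    """Yield (held_out, train_idx, test_idx), holding out one participant per fold."""
    for held_out in sorted(set(participant_ids)):
        test_idx = [i for i, p in enumerate(participant_ids) if p == held_out]
        train_idx = [i for i, p in enumerate(participant_ids) if p != held_out]
        yield held_out, train_idx, test_idx


# One entry per window, naming the participant it came from (toy data).
participant_ids = ["AP01", "AP01", "AP02", "AP03", "AP03", "AP04", "AP05"]
folds = list(lopo_splits(participant_ids))

for held_out, train_idx, test_idx in folds:
    print(f"test on {held_out}: {len(train_idx)} train / {len(test_idx)} test windows")
```

Because every window from the held-out participant lands in the test set, overlapping windows from the same recording can never appear on both sides of the split. scikit-learn's `LeaveOneGroupOut` offers the same behavior if that library is already a dependency.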
🏆 Bonus Extension (Optional)
Sleep stage classification using:
→ Sleep Profile annotations in Data folders
Stages:
Wake, REM, N1, N2, N3
Dataset & script modifications already prepared 👌
(If included, saved as sleep_stage_dataset.csv)
👨💻 Author
Priyanshu Tomar
Health Sensing — Sleep Apnea ML
DeepMedico™ Assignment | 2024-2025
📌 Conclusion
This project successfully demonstrates:
✔ Biomedical time-series preprocessing
✔ Event-driven labeling
✔ Deep learning on respiratory signals
✔ Patient-independent evaluation via LOPO
✔ Fully automated documentation & visualization
This pipeline is extensible to clinical sleep monitoring systems and wearable devices for apnea screening.