2025TorontoHealthDatathon

Sleep-disordered breathing (SDB), including obstructive sleep apnea (OSA), is a prevalent yet underdiagnosed condition. It affects an estimated 1 billion individuals worldwide, and moderate-to-severe OSA is present in approximately 425 million people. Despite its health consequences, which include increased risk of cardiovascular disease, metabolic disorders, neurocognitive impairment, and reduced quality of life, OSA remains undiagnosed in up to 80% of affected individuals. Diagnosis is often delayed by a long and complex care pathway: waiting to see a primary care physician, specialist referral, and polysomnography (PSG) testing. In many healthcare systems, patients wait several months to years for a formal diagnosis, particularly in regions with limited access to sleep specialists and diagnostic facilities.

This study examines the predictive value of self-reported measures (demographics, comorbidities, and the STOP-BANG questionnaire) for objective sleep study metrics such as the Respiratory Disturbance Index (RDI) and the Apnea-Hypopnea Index (AHI). Self-reported measures are a promising basis for screening and risk stratification, potentially reducing unnecessary PSG referrals and improving diagnostic prioritization. If they predict these metrics well, screening becomes cheaper and more accessible: high-risk individuals can be prioritized for timely diagnostic testing while healthcare expenditures are kept down. Early identification of patients at risk for moderate-to-severe OSA allows earlier intervention, potentially reducing complications such as cardiovascular disease and cognitive decline.
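To make the two sides of the prediction problem concrete, here is a minimal sketch of the self-reported input (a STOP-BANG score) and the objective target (an AHI severity category). It assumes the commonly used cut-offs: BMI > 35 kg/m², age > 50, neck circumference > 40 cm, and adult AHI bands at 5, 15, and 30 events per hour. The class and function names are illustrative and are not taken from the notebook, which may code these variables differently.

```python
from dataclasses import dataclass


@dataclass
class StopBangAnswers:
    """The eight STOP-BANG items (field names are illustrative)."""
    snoring: bool              # S: loud snoring
    tired: bool                # T: daytime tiredness
    observed_apnea: bool       # O: observed pauses in breathing
    high_blood_pressure: bool  # P: treated or untreated hypertension
    bmi: float                 # B: body mass index, kg/m^2
    age: int                   # A: age in years
    neck_cm: float             # N: neck circumference, cm
    male: bool                 # G: male sex


def stop_bang_score(a: StopBangAnswers) -> int:
    """One point per positive item, so the total ranges from 0 to 8."""
    items = [
        a.snoring, a.tired, a.observed_apnea, a.high_blood_pressure,
        a.bmi > 35, a.age > 50, a.neck_cm > 40, a.male,
    ]
    return sum(items)


def stop_bang_risk(score: int) -> str:
    """Assumed bands: 0-2 low, 3-4 intermediate, 5-8 high."""
    if score <= 2:
        return "low"
    if score <= 4:
        return "intermediate"
    return "high"


def ahi_severity(ahi: float) -> str:
    """Assumed adult AHI bands, in events per hour of sleep."""
    if ahi < 5:
        return "none"
    if ahi < 15:
        return "mild"
    if ahi < 30:
        return "moderate"
    return "severe"


# Hypothetical patient, invented for illustration only.
patient = StopBangAnswers(True, True, False, True, 36.2, 58, 41.0, True)
score = stop_bang_score(patient)
print(score, stop_bang_risk(score), ahi_severity(22.5))
```

In this framing, the screening question is how well `stop_bang_risk` (or a richer model) anticipates the `ahi_severity` category measured later by PSG.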

While the STOP-BANG questionnaire provides a validated risk score, machine learning (ML)-based models can improve predictive accuracy by learning complex patterns in the data, dynamically weighting features, and integrating additional predictors such as continuous body mass index (BMI) and neck circumference values (which STOP-BANG reduces to yes/no items), comorbidities, and biomarkers. ML models also enable multimodal data integration, combining self-reported information with wearable device data (e.g., Fitbit, Apple Watch), electronic health records (EHRs), and imaging (e.g., upper airway MRI or CT scans). Unlike traditional categorical scoring, ML models provide continuous risk estimates, allowing for personalized thresholds and more precise risk stratification. ML models can also be made more interpretable for clinicians through feature importance analysis, for example with SHAP values, or by using inherently interpretable models such as decision trees. These models can be adapted to specific populations, accounting for demographic and physiological variation, and can be automated and scaled via clinical decision support systems (CDSS) and mobile health (mHealth) applications for real-time risk assessment.
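The contrast between a categorical score and a continuous risk estimate can be sketched with a small logistic regression. Everything below is an assumption made for illustration: the cohort is synthetic, the two features (a scaled STOP-BANG score and a scaled BMI) and the generating coefficients are arbitrary rather than clinical estimates, and the model is fit with plain gradient descent so the example needs only the standard library. The notebook's actual models and features may differ.

```python
import math
import random


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def make_synthetic_cohort(n: int, seed: int = 0):
    """Synthetic (scaled STOP-BANG, scaled BMI) rows with simulated labels."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        x = (rng.random(), rng.random())  # both features scaled to [0, 1]
        # Arbitrary generating coefficients, not clinical estimates.
        p = sigmoid(-3.0 + 4.0 * x[0] + 3.0 * x[1])
        rows.append(x)
        labels.append(1 if rng.random() < p else 0)
    return rows, labels


def fit_logistic(rows, labels, lr=0.5, epochs=2000):
    """Batch gradient descent on mean log-loss; returns (bias, weights)."""
    n, d = len(rows), len(rows[0])
    bias, weights = 0.0, [0.0] * d
    for _ in range(epochs):
        grad_b, grad_w = 0.0, [0.0] * d
        for x, y in zip(rows, labels):
            err = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) - y
            grad_b += err
            for j in range(d):
                grad_w[j] += err * x[j]
        bias -= lr * grad_b / n
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
    return bias, weights


def predict_risk(model, x) -> float:
    """Continuous probability of the simulated outcome for one patient."""
    bias, weights = model
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


rows, labels = make_synthetic_cohort(200)
model = fit_logistic(rows, labels)

# Two hypothetical patients: low versus high on both scaled features.
low = predict_risk(model, (0.125, 0.2))
high = predict_risk(model, (0.875, 0.8))
print(f"low-risk profile: {low:.2f}, high-risk profile: {high:.2f}")
```

Because the output is a probability rather than a band, the referral threshold can be chosen per setting, for example lower where PSG capacity is ample and higher where it is scarce.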

Leveraging machine learning in predictive modeling could significantly enhance clinical decision-making, optimize healthcare resource allocation, and improve access to timely OSA diagnosis and management.

Repository files: 2025_Toronto_Health_Datathon.ipynb, README.md
