saitiger/Speech-Path

SpeechPath AI

Team Members

Sai Raina, Suraj Shivaramakrishnan

Project Title

SpeechPath AI: Intelligent Speech Therapy Practice Between Sessions


Problem & Solution

The Problem

  • 16 million children in the US have communication disorders
  • Most need speech therapy three times per week but receive only one session per week
  • Research shows 30-50% improvement in outcomes with increased practice frequency
  • Progress stalls in the gap between sessions, when children have no structured practice to do at home

Our Solution

SpeechPath AI is an AI-powered practice companion that fills the therapy gap:

  • Interactive, game-based scenarios designed by speech-language pathologists
  • Real-time AI feedback using speech recognition and motor learning principles
  • Child-friendly evaluation with stars, encouragement, and specific guidance
  • SLP dashboard showing practice patterns, accuracy trends, and engagement
  • Runs on any iOS or Android smartphone; no special hardware needed

Not a replacement for therapists. A multiplier for their impact.


How It Works

Child Experience

  1. Child opens app, selects a fun practice scenario
  2. App presents story-based practice (e.g., "Help order pizza, practice /s/ sounds")
  3. Child speaks into device
  4. AI analyzes speech in real time for articulation, clarity, and fluency
  5. Child gets instant, affirming feedback with stars and encouragement
  6. Therapist sees the data: which sounds are mastered and which need work

AI Components

Speech Analysis

  • Whisper (OpenAI) for speech-to-text transcription
  • Audio feature extraction: pauses, stuttering indicators, prolonged sounds
  • Articulation scoring: onset clarity + spectral quality + confidence
  • Fluency scoring: penalty for pauses, stutters, prolonged sounds
  • Clarity scoring: signal-to-noise ratio (SNR) analysis
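
A minimal sketch of how the three scores listed above might be computed on a 0-100 scale. It assumes the audio features are already extracted and normalized upstream, and that "confidence" means the transcription confidence. The weights, penalty sizes, SNR range, and all names (AudioFeatures, articulation_score, and so on) are illustrative assumptions, not the project's actual values.

```python
from dataclasses import dataclass


def clamp(value: float, low: float = 0.0, high: float = 100.0) -> float:
    """Keep a score inside the 0-100 range."""
    return max(low, min(high, value))


@dataclass
class AudioFeatures:
    onset_clarity: float     # 0-1, assumed normalized upstream
    spectral_quality: float  # 0-1, assumed normalized upstream
    asr_confidence: float    # 0-1, assumed to come from the transcription step
    pause_count: int
    stutter_count: int
    prolongation_count: int
    snr_db: float            # signal-to-noise ratio in decibels


def articulation_score(f: AudioFeatures) -> float:
    # Weighted sum of the three components; the weights are hypothetical.
    weighted = 0.4 * f.onset_clarity + 0.3 * f.spectral_quality + 0.3 * f.asr_confidence
    return clamp(100.0 * weighted)


def fluency_score(f: AudioFeatures) -> float:
    # Start from 100 and subtract a hypothetical penalty per disfluency event.
    penalty = 5 * f.pause_count + 10 * f.stutter_count + 8 * f.prolongation_count
    return clamp(100.0 - penalty)


def clarity_score(f: AudioFeatures, floor_db: float = 0.0, ceiling_db: float = 30.0) -> float:
    # Map SNR linearly onto 0-100 between an assumed floor and ceiling.
    return clamp(100.0 * (f.snr_db - floor_db) / (ceiling_db - floor_db))
```

Clamping keeps every score in range even when a recording is unusually noisy or disfluent, so downstream feedback logic never sees a negative or out-of-range value.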

Response Evaluation (LLM-based)

  • Uses GPT-4 to generate child-friendly feedback
  • Converts audio metrics into clinical judgments
  • Ensures affirming, specific, actionable guidance
  • Never uses negative language (never "wrong", always "try again")
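
One way to enforce the "no negative language" rule is a small guard that checks generated feedback before it reaches the child. The sketch below is an assumption about how that could look: only "wrong" comes from the rule above, while the rest of the deny-list, the fallback message, and the function names are hypothetical.

```python
import re
from typing import List

# Hypothetical deny-list; only "wrong" is named in the README.
DISCOURAGED_WORDS = ("wrong", "bad", "incorrect", "fail", "failed")

# Whole-word match so that words such as "badge" are not flagged.
_DISCOURAGED = re.compile(
    r"\b(?:" + "|".join(DISCOURAGED_WORDS) + r")\b", re.IGNORECASE
)

# Hypothetical affirming fallback used when the generated text is rejected.
FALLBACK_FEEDBACK = "Nice effort! Let's try again together."


def find_discouraged(text: str) -> List[str]:
    """Return every discouraged word found in the text, lowercased."""
    return [match.group(0).lower() for match in _DISCOURAGED.finditer(text)]


def safe_feedback(llm_text: str) -> str:
    """Pass affirming feedback through; swap in the fallback otherwise."""
    return FALLBACK_FEEDBACK if find_discouraged(llm_text) else llm_text
```

A word list cannot judge tone, so a guard like this would complement the prompt instructions rather than replace them.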

Scenario Generation (LLM-based)

  • Generates personalized practice activities based on:
    • Target sound/goal
    • Child age and interests
    • Disorder type (articulation, fluency, apraxia, language)
    • Difficulty level
  • Embeds target sounds in meaningful contexts (not isolated drills)
  • Creates multi-prompt scenarios with visual supports
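
The inputs listed above can be collected into a single request object and turned into an LLM prompt. This is a sketch under stated assumptions: ScenarioRequest, build_scenario_prompt, the difficulty labels, and the prompt wording are all hypothetical; only the four disorder types and the input fields come from the list above.

```python
from dataclasses import dataclass
from typing import List

# Disorder types named in the README.
DISORDER_TYPES = {"articulation", "fluency", "apraxia", "language"}


@dataclass
class ScenarioRequest:
    target_sound: str      # e.g. "/s/"
    child_age: int
    interests: List[str]
    disorder_type: str
    difficulty: str        # free-form label such as "easy" (assumed)

    def __post_init__(self) -> None:
        if self.disorder_type not in DISORDER_TYPES:
            raise ValueError(f"Unknown disorder type: {self.disorder_type!r}")


def build_scenario_prompt(req: ScenarioRequest) -> str:
    """Assemble a plain-text prompt; the wording is illustrative only."""
    interests = ", ".join(req.interests) if req.interests else "everyday play"
    return "\n".join([
        f"Create a speech practice scenario for a {req.child_age}-year-old child.",
        f"Target sound or goal: {req.target_sound}",
        f"Disorder type: {req.disorder_type}",
        f"Difficulty: {req.difficulty}",
        f"Interests: {interests}",
        "Embed the target sound in meaningful, natural phrases, not isolated drills.",
        "Return a story context, several practice prompts, and suggested visual supports.",
    ])
```

Validating the disorder type at construction time means a malformed request fails before any LLM call is made.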

Technical Architecture

Frontend (React Native)

  • Mobile app for iOS/Android
  • Game-based UI for children
  • SLP dashboard preview

Backend (FastAPI)

  • RESTful API for all operations
  • Audio processing pipeline
  • LLM integration (OpenAI GPT-4)
  • Session management
  • Progress tracking

Core Services

  • Speech analysis (local inference for privacy)
  • Response evaluation (LLM-based)
  • Scenario generation (LLM-based)
  • Practice session management
  • Progress reporting

How AI Is Used

1. Speech Analysis (Audio Processing)

  • Whisper ASR: Transcribe speech with confidence scores
  • Feature Extraction: Detect pauses, stuttering, prolonged sounds
  • Scoring: Articulation (0-100), Fluency (0-100), Clarity (0-100)
  • Output: Structured metrics for LLM evaluation
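
To illustrate the feature-extraction step, here is a minimal pause detector that works on per-frame energy values. It is a sketch, not the project's pipeline: it assumes the audio has already been split into fixed-length frames with an energy value per frame, and the frame length, silence threshold, and minimum pause duration are hypothetical defaults.

```python
from typing import List, Sequence, Tuple


def detect_pauses(
    frame_energy: Sequence[float],
    frame_ms: int = 20,               # assumed frame length
    silence_threshold: float = 0.01,  # assumed energy cutoff for silence
    min_pause_ms: int = 300,          # assumed minimum duration to count as a pause
) -> List[Tuple[int, int]]:
    """Return (start_ms, end_ms) for each silent run lasting at least min_pause_ms."""
    pauses: List[Tuple[int, int]] = []
    run_start = None
    # A loud sentinel frame at the end closes any silent run still open.
    for index, energy in enumerate(list(frame_energy) + [float("inf")]):
        if energy < silence_threshold:
            if run_start is None:
                run_start = index
        elif run_start is not None:
            start_ms, end_ms = run_start * frame_ms, index * frame_ms
            if end_ms - start_ms >= min_pause_ms:
                pauses.append((start_ms, end_ms))
            run_start = None
    return pauses
```

The number of detected pauses could then feed the fluency penalty. Note that this version also counts leading and trailing silence, which a real pipeline would probably trim first.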

2. Response Evaluation (LLM)

  • Input: Audio metrics + child age + target sound + disorder type
  • Process: GPT-4 generates clinical judgment + child-friendly feedback
  • Output: Stars (1-3), praise, specific guidance, encouragement
  • Example: 85%+ accuracy → ⭐⭐⭐ "Awesome job! Your /s/ was super clear!"
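
The star rating can be sketched as a simple threshold mapping. Only the 85% cutoff for three stars comes from the example above; the 60% cutoff for two stars and the function name are assumptions made for illustration.

```python
def stars_for_accuracy(accuracy_pct: float) -> int:
    """Map an accuracy percentage (0-100) to a 1-3 star rating."""
    if not 0 <= accuracy_pct <= 100:
        raise ValueError("accuracy_pct must be between 0 and 100")
    if accuracy_pct >= 85:   # threshold stated in the README example
        return 3
    if accuracy_pct >= 60:   # hypothetical middle threshold
        return 2
    return 1                 # every attempt earns at least one star
```

Because the lowest rating is one star rather than zero, even a weak attempt is rewarded, which fits the affirming feedback style described earlier.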

3. Scenario Generation (LLM)

  • Input: Child age + target sound + interests + disorder type + difficulty
  • Process: GPT-4 creates engaging practice activities
  • Output: Story context + 3-5 practice prompts with phonetic focus + visual supports
  • Example: "Pizza Party" scenario for /s/ sound with natural contexts ("Please, may I have a slice?")
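
Since LLM output can vary, the generated scenario is worth validating before it is shown to a child. The sketch below assumes the model is asked to return JSON with the hypothetical keys title, story_context, prompts, and visual_supports; it checks the 3-5 prompt count described above.

```python
import json
from dataclasses import dataclass, field
from typing import List


@dataclass
class PracticePrompt:
    text: str
    phonetic_focus: str


@dataclass
class Scenario:
    title: str
    story_context: str
    prompts: List[PracticePrompt]
    visual_supports: List[str] = field(default_factory=list)


def parse_scenario(raw_json: str) -> Scenario:
    """Parse LLM output (assumed JSON schema) and enforce 3-5 practice prompts."""
    data = json.loads(raw_json)
    prompts = [
        PracticePrompt(text=item["text"], phonetic_focus=item["phonetic_focus"])
        for item in data["prompts"]
    ]
    if not 3 <= len(prompts) <= 5:
        raise ValueError(f"Expected 3-5 prompts, got {len(prompts)}")
    return Scenario(
        title=data["title"],
        story_context=data["story_context"],
        prompts=prompts,
        visual_supports=data.get("visual_supports", []),
    )
```

A missing key raises KeyError and malformed JSON raises json.JSONDecodeError, so the caller can catch these and ask the model to regenerate.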

Why This Works

  • Motor Learning: More frequent practice reinforces new speech motor patterns faster (research shows 30-50% improvement in outcomes)
  • Engagement: Game-based play keeps kids motivated to practice
  • Clinical Validation: Scenarios are grounded in evidence-based speech therapy practices
  • Scalability: LLM-generated scenarios adapt to each child's needs without new hardware

Demo Video

[]

SpeechPath AI: Filling the practice gap between sessions. Accessible. Scalable. Evidence-backed.
