Sai Raina, Suraj Shivaramakrishnan
SpeechPath AI: Intelligent Speech Therapy Practice Between Sessions
- 16 million children in the US have communication disorders
- Most need speech therapy 3x per week but only receive 1x per week
- Research shows 30-50% improvement in outcomes with increased practice frequency
- The gap between sessions is where progress stalls—children have nothing meaningful to do at home
SpeechPath AI is an AI-powered practice companion that fills the therapy gap:
- Interactive, game-based scenarios designed by speech-language pathologists
- Real-time AI feedback using speech recognition and motor learning principles
- Child-friendly evaluation with stars, encouragement, and specific guidance
- SLP dashboard showing practice patterns, accuracy trends, and engagement
- Runs on any smartphone (iOS/Android)—no special hardware needed
Not a replacement for therapists. A multiplier for their impact.
- Child opens app, selects a fun practice scenario
- App presents story-based practice (e.g., "Help order pizza, practice /s/ sounds")
- Child speaks into device
- AI analyzes speech in real time: articulation, clarity, fluency
- Child gets instant, affirming feedback with stars and encouragement
- Therapist sees the data: which sounds are mastered and which need work
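As an illustration of the last step, the sketch below groups practice attempts by target sound and labels each sound for the therapist's view. The function name `summarize_by_sound`, the `(sound, score)` input shape, and the 85-point mastery threshold are assumptions made for this example, not details of the actual dashboard.

```python
from collections import defaultdict
from statistics import mean


def summarize_by_sound(attempts, mastery_threshold=85.0):
    """Group (sound, score) attempts and label each sound.

    The threshold is a hypothetical value chosen for illustration.
    """
    scores_by_sound = defaultdict(list)
    for sound, score in attempts:
        scores_by_sound[sound].append(float(score))

    summary = {}
    for sound, scores in scores_by_sound.items():
        avg = mean(scores)
        summary[sound] = {
            "attempts": len(scores),
            "mean_score": round(avg, 1),
            "status": "mastered" if avg >= mastery_threshold else "needs work",
        }
    return summary


# Made-up attempts: scores are on a 0-100 scale.
attempts = [("/s/", 90), ("/s/", 88), ("/r/", 60), ("/r/", 70)]
report = summarize_by_sound(attempts)
for sound, row in report.items():
    print(sound, row["status"], row["mean_score"])
```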
Speech Analysis
- Whisper (OpenAI) for speech-to-text transcription
- Audio feature extraction: pauses, stuttering indicators, prolonged sounds
- Articulation scoring: onset clarity + spectral quality + ASR confidence
- Fluency scoring: penalty for pauses, stutters, prolonged sounds
- Clarity scoring: signal-to-noise ratio (SNR) analysis
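The fluency and clarity scores above could be computed along these lines. This is a minimal sketch: the per-event penalty weights and the 0 to 30 dB mapping range are placeholder assumptions, and the function names are hypothetical. Only the general approach (penalties for pauses, stutters, and prolonged sounds; SNR for clarity) comes from the list above.

```python
import math


def fluency_score(pauses, stutters, prolongations,
                  pause_penalty=5.0, stutter_penalty=10.0,
                  prolongation_penalty=7.0):
    """Start at 100 and subtract a penalty per detected disfluency.

    Penalty weights are illustrative assumptions.
    """
    penalty = (pauses * pause_penalty
               + stutters * stutter_penalty
               + prolongations * prolongation_penalty)
    return max(0.0, 100.0 - penalty)


def snr_db(signal_power, noise_power):
    """Signal-to-noise ratio in decibels from mean signal and noise power."""
    if signal_power <= 0 or noise_power <= 0:
        raise ValueError("power values must be positive")
    return 10.0 * math.log10(signal_power / noise_power)


def clarity_score(snr, floor_db=0.0, ceiling_db=30.0):
    """Map SNR linearly onto 0-100, clamped at both ends.

    The floor and ceiling are assumed values, not calibrated ones.
    """
    fraction = (snr - floor_db) / (ceiling_db - floor_db)
    return 100.0 * min(1.0, max(0.0, fraction))


print(fluency_score(pauses=2, stutters=1, prolongations=0))
print(clarity_score(snr_db(signal_power=100.0, noise_power=1.0)))
```

In practice the weights would need tuning against clinician ratings; the clamping keeps both scores inside the 0-100 range.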
Response Evaluation (LLM-based)
- Uses GPT-4 to generate child-friendly feedback
- Converts audio metrics into clinical judgments
- Ensures affirming, specific, actionable guidance
- Never uses negative language (never "wrong", always "try again")
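One way to enforce the no-negative-language rule is a simple check on the generated feedback before it reaches the child. The sketch below is an assumption about how such a guard might look. Only "wrong" comes from the list above; the other banned words and the function names are hypothetical.

```python
import re

# Only "wrong" is named in the source; the rest are illustrative additions.
BANNED_WORDS = {"wrong", "bad", "incorrect", "failed"}


def find_negative_words(feedback):
    """Return the banned words that appear in the feedback, sorted."""
    words = re.findall(r"[a-z']+", feedback.lower())
    return sorted({word for word in words if word in BANNED_WORDS})


def is_affirming(feedback):
    """True when the feedback contains none of the banned words."""
    return not find_negative_words(feedback)


print(is_affirming("Great effort! Let's try again with a snaky /s/."))
print(find_negative_words("That was wrong."))
```

A failed check could trigger a regeneration request to the LLM rather than showing the text.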
Scenario Generation (LLM-based)
- Generates personalized practice activities based on:
  - Target sound/goal
  - Child age and interests
  - Disorder type (articulation, fluency, apraxia, language)
  - Difficulty level
- Embeds target sounds in meaningful contexts (not isolated drills)
- Creates multi-prompt scenarios with visual supports
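The inputs listed above could be collected into a request object and turned into an LLM prompt. The sketch below assumes a hypothetical `ScenarioRequest` dataclass and `build_scenario_prompt` helper. The prompt wording is illustrative and the call to the LLM itself is left out.

```python
from dataclasses import dataclass

# The four disorder types named in the list above.
DISORDER_TYPES = ("articulation", "fluency", "apraxia", "language")


@dataclass
class ScenarioRequest:
    target_sound: str
    child_age: int
    interests: list
    disorder_type: str
    difficulty: str

    def __post_init__(self):
        if self.disorder_type not in DISORDER_TYPES:
            raise ValueError(f"unknown disorder type: {self.disorder_type!r}")


def build_scenario_prompt(req):
    """Assemble an illustrative prompt string; wording is an assumption."""
    likes = f" who likes {', '.join(req.interests)}" if req.interests else ""
    return (
        f"Create a speech practice scenario for a {req.child_age}-year-old"
        f"{likes}. Target: {req.target_sound} "
        f"({req.disorder_type}, {req.difficulty} difficulty). "
        "Embed the target in a meaningful story context, not isolated drills, "
        "and describe a visual support for each prompt."
    )


request = ScenarioRequest("/s/", 6, ["pizza", "dinosaurs"], "articulation", "easy")
print(build_scenario_prompt(request))
```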
Frontend (React Native)
- Mobile app for iOS/Android
- Game-based UI for children
- SLP dashboard preview
Backend (FastAPI)
- RESTful API for all operations
- Audio processing pipeline
- LLM integration (OpenAI GPT-4)
- Session management
- Progress tracking
Core Services
- Speech analysis (local inference for privacy)
- Response evaluation (LLM-based)
- Scenario generation (LLM-based)
- Practice session management
- Progress reporting
- Whisper ASR: Transcribe speech with confidence scores
- Feature Extraction: Detect pauses, stuttering, prolonged sounds
- Scoring: Articulation (0-100), Fluency (0-100), Clarity (0-100)
- Output: Structured metrics for LLM evaluation
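To make the scoring and output steps concrete, here is a minimal sketch of an articulation score built from the three components named earlier (onset clarity, spectral quality, confidence), followed by a structured record for the LLM evaluator. The weights, the 0 to 1 input scale, and the `SpeechMetrics` field names are assumptions made for illustration.

```python
import json
from dataclasses import asdict, dataclass


def articulation_score(onset_clarity, spectral_quality, asr_confidence,
                       weights=(0.4, 0.3, 0.3)):
    """Weighted average of three 0-1 components, scaled to 0-100.

    The weights are placeholder values, not tuned ones.
    """
    components = (onset_clarity, spectral_quality, asr_confidence)
    if any(not 0.0 <= c <= 1.0 for c in components):
        raise ValueError("components must be between 0 and 1")
    total = sum(w * c for w, c in zip(weights, components))
    return round(100.0 * total / sum(weights), 1)


@dataclass
class SpeechMetrics:
    """Hypothetical structured output handed to the LLM evaluator."""
    transcript: str
    articulation: float
    fluency: float
    clarity: float

    def to_json(self):
        return json.dumps(asdict(self))


metrics = SpeechMetrics(
    transcript="Please, may I have a slice?",
    articulation=articulation_score(0.9, 0.8, 0.95),
    fluency=90.0,
    clarity=75.0,
)
print(metrics.to_json())
```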
- Input: Audio metrics + child age + target sound + disorder type
- Process: GPT-4 generates clinical judgment + child-friendly feedback
- Output: Stars (1-3), praise, specific guidance, encouragement
- Example: 85%+ accuracy → ⭐⭐⭐ "Awesome job! Your /s/ was super clear!"
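The star rating could be a simple threshold mapping. In the sketch below, the 85% cutoff for three stars comes from the example above, while the 60% cutoff for two stars and the function name are assumptions. The minimum is one star, so every attempt earns something.

```python
def stars_for_accuracy(accuracy_pct, three_star_min=85.0, two_star_min=60.0):
    """Map an accuracy percentage to 1-3 stars.

    three_star_min follows the 85%+ example; two_star_min is assumed.
    """
    if not 0.0 <= accuracy_pct <= 100.0:
        raise ValueError("accuracy must be between 0 and 100")
    if accuracy_pct >= three_star_min:
        return 3
    if accuracy_pct >= two_star_min:
        return 2
    return 1


for accuracy in (92.0, 70.0, 30.0):
    print(accuracy, "⭐" * stars_for_accuracy(accuracy))
```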
- Input: Child age + target sound + interests + disorder type + difficulty
- Process: GPT-4 creates engaging practice activities
- Output: Story context + 3-5 practice prompts with phonetic focus + visual supports
- Example: "Pizza Party" scenario for /s/ sound with natural contexts ("Please, may I have a slice?")
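Because the scenario comes from an LLM, it may be worth validating its shape before showing it to a child. The sketch below checks the output described above (story context, 3-5 prompts, visual supports). The class and field names are hypothetical, and apart from the "slice" line taken from the example above, the sample content is made up.

```python
from dataclasses import dataclass


@dataclass
class PracticePrompt:
    text: str            # what the child is asked to say
    phonetic_focus: str  # target sound, e.g. "/s/"
    visual_support: str  # description of the accompanying picture


@dataclass
class Scenario:
    title: str
    story_context: str
    prompts: list


def validate_scenario(scenario):
    """Return a list of problems; an empty list means the shape is valid."""
    problems = []
    if not scenario.story_context.strip():
        problems.append("missing story context")
    if not 3 <= len(scenario.prompts) <= 5:
        problems.append(f"expected 3-5 prompts, got {len(scenario.prompts)}")
    for number, prompt in enumerate(scenario.prompts, start=1):
        if not prompt.text.strip():
            problems.append(f"prompt {number} has no text")
        if not prompt.visual_support.strip():
            problems.append(f"prompt {number} has no visual support")
    return problems


pizza_party = Scenario(
    title="Pizza Party",
    story_context="You are ordering pizza for your friends.",  # made up
    prompts=[
        PracticePrompt("Please, may I have a slice?", "/s/", "pizza slice"),
        PracticePrompt("Some sauce, please.", "/s/", "sauce jar"),    # made up
        PracticePrompt("Six sodas for us.", "/s/", "row of sodas"),   # made up
    ],
)
print(validate_scenario(pizza_party))
```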
- Motor Learning: More frequent practice reinforces new speech motor patterns faster (research shows 30-50% improvement in outcomes with increased practice frequency)
- Engagement: Game-based play keeps kids motivated to practice
- Clinical Validation: Scenarios grounded in evidence-based speech therapy practices
- Scalability: LLM-generated scenarios adapt to each child's goals and interests without new hardware
SpeechPath AI: Filling the practice gap between sessions. Accessible. Scalable. Evidence-backed.