What It Does

EchoAccess converts inaccessible web forms into a guided voice conversation:

  1. Sign in with Supabase Auth
  2. Select a form (TD Bank, TTC Disability Card, CRA Benefits)
  3. Gemini parses the HTML and extracts every field with plain-English labels
  4. EchoAccess asks questions one at a time, conversationally, via voice
  5. You answer by speaking or typing
  6. Backboard remembers you so returning users get pre-filled suggestions
  7. Confirm and submit after hearing a full summary read back

Three real Canadian forms. Fully voice-navigable. Zero mouse required.


Inspiration

Over 1.5 million Canadians live with significant vision loss, and the forms they depend on most (banking, transit, government benefits) are the hardest to use. Screen readers choke on nested dropdowns. Tab order is unpredictable. One wrong keystroke resets an entire application.

We asked ourselves: "Who am I leaving out?"

The answer was clear. EchoAccess was built on one idea: filling out a form should be as simple as having a conversation.


How We Built It

Voice Conversation Pipeline

STEP | SERVICE | WHAT HAPPENS
1. Parse Form | Gemini 2.5 Flash | Raw HTML ingested, every field extracted with plain-English labels, input types identified (text, select, radio, date, etc.)
2. Create Session | Supabase + Backboard.io | Per-user assistant and thread created with row-level security. Returning users matched to existing memory profile
3. Generate Question | Gemini 2.5 Flash | Next unanswered field converted into a natural, conversational voice prompt with context from previous answers
4. Speak Question | ElevenLabs TTS | Question synthesized into natural-sounding speech using the ElevenLabs text-to-speech API
5. Capture Answer | ElevenLabs STT | User's spoken response transcribed in real time. Text input supported as fallback
6. Store Answer | Backboard.io Memory | Answer persisted to long-term user profile. Carries across all forms (bank, transit, CRA)
7. Read Summary | ElevenLabs TTS + Gemini | All answers compiled into a plain-English summary, read aloud for confirmation before submission
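
The control flow behind steps 3 and 7 can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: the `FormField` shape, the function names, and the template-based `fallback_question` (standing in for the Gemini-generated prompt) are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FormField:
    """One parsed form field (hypothetical shape, not the project's schema)."""
    name: str                      # machine name, e.g. "province"
    label: str                     # plain-English label read to the user
    input_type: str = "text"       # text, select, radio, date, ...
    options: list[str] = field(default_factory=list)


def next_unanswered(fields: list[FormField], answers: dict[str, str]) -> Optional[FormField]:
    """Return the first field with no saved answer, or None when the form is complete."""
    for f in fields:
        if f.name not in answers:
            return f
    return None


def fallback_question(f: FormField) -> str:
    """Template prompt standing in for the LLM-generated question (assumption)."""
    if f.options:
        return f"For {f.label}, please choose one of: {', '.join(f.options)}."
    return f"What is your {f.label}?"


def build_summary(fields: list[FormField], answers: dict[str, str]) -> str:
    """Compile the answered fields into one sentence-per-field summary for read-back."""
    lines = [f"{f.label}: {answers[f.name]}" for f in fields if f.name in answers]
    return "Here is what I have. " + ". ".join(lines) + "."
```

In a loop, the backend would call `next_unanswered` after each saved answer, speak the resulting question, and switch to `build_summary` once it returns `None`.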

Per-User Memory System

LAYER | TECHNOLOGY | WHAT IT DOES
Auth | Supabase Auth (JWT) | Secure sign-up/login with email. JWT passed on every API call
Session Mapping | Supabase PostgreSQL | user_sessions table maps each user_id to a dedicated Backboard assistant_id + thread_id
Isolation | Row-Level Security | Supabase RLS ensures no user can access another user's session or memory
Long-Term Memory | Backboard.io REST API | Stores name, email, address, SIN, and more. Pre-fills across forms on return visits
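
The session-mapping and isolation layers can be illustrated with an in-memory stand-in. This is a sketch under stated assumptions: the real project stores the mapping in a Supabase `user_sessions` table and enforces isolation with RLS policies, whereas the `SessionStore` class, its method names, and the generated ID formats below are hypothetical and exist only to show the get-or-create and owner-only access logic.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class UserSession:
    """One row of the user_sessions mapping: user_id -> assistant_id + thread_id."""
    user_id: str
    assistant_id: str
    thread_id: str


class SessionStore:
    """In-memory stand-in for the user_sessions table (illustration only)."""

    def __init__(self) -> None:
        self._rows: dict[str, UserSession] = {}

    def get_or_create(self, user_id: str) -> UserSession:
        """Reuse the user's existing assistant/thread, or create a dedicated pair."""
        session = self._rows.get(user_id)
        if session is None:
            # Placeholder IDs; in the real system these would come from Backboard.
            session = UserSession(
                user_id=user_id,
                assistant_id=f"asst-{uuid.uuid4().hex}",
                thread_id=f"thread-{uuid.uuid4().hex}",
            )
            self._rows[user_id] = session
        return session

    def get_for_caller(self, caller_id: str, user_id: str) -> UserSession:
        """Mimic the row-level rule: a caller may only read their own row."""
        if caller_id != user_id:
            raise PermissionError("callers may only access their own session")
        return self._rows[user_id]
```

A returning user gets the same `thread_id` back from `get_or_create`, which is what lets long-term memory carry over between forms.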

Frontend Accessibility Stack

FEATURE | IMPLEMENTATION | WHY IT MATTERS
Voice Output | ElevenLabs TTS API | Every question read aloud in a natural-sounding voice, no screen required
Voice Input | ElevenLabs STT API | Hands-free answering by speaking naturally
Screen Reader Support | aria-live="polite" on chat log | New messages announced automatically
Interactive Labels | aria-label on all controls | Every button and input identified by assistive tech
Progress Tracking | aria-current="step" on active field | Users always know where they are in the form
Keyboard Navigation | Full Tab + Enter support | Entire flow navigable without a mouse
Animations | Framer Motion | Smooth transitions that respect prefers-reduced-motion

Tech Stack

LAYER | TECHNOLOGY
Backend | FastAPI + Python 3.11+
Auth | Supabase Auth (JWT)
Database | Supabase PostgreSQL
LLM | Gemini 2.5 Flash
Voice | ElevenLabs TTS + STT
Memory | Backboard.io
Frontend | React 19 + Vite + TypeScript
UI | shadcn/ui + Tailwind CSS v4 + Framer Motion

Demo Forms

FORM | FIELDS | REAL-WORLD USE CASE
TD Bank Account Application | Personal info, SIN, employment, income | Opening a bank account without visiting a branch
TTC Disability Discount Card | Accessibility needs, ODSP/CPP status, disability type | Applying for subsidized transit in Toronto
CRA Benefits Application | SIN, marital status, dependents, direct deposit | Claiming government benefits independently

API Reference

METHOD | ENDPOINT | AUTH | DESCRIPTION
GET | /api/health | (none) | Health check + Backboard status
GET | /api/forms | Bearer | List available forms
POST | /api/parse-form | Bearer | Parse form HTML into structured fields
POST | /api/new-session | Bearer | Create or reuse per-user Backboard thread
POST | /api/chat | Bearer | Generate next conversational question
POST | /api/save-answer | Bearer | Save field answer to memory
GET | /api/user-profile | Bearer | Retrieve stored user profile
POST | /api/submit-form | Bearer | Generate plain-English summary
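
As a rough illustration of calling these endpoints, the helper below builds an authenticated request with Python's standard library. The endpoint paths and the Bearer scheme come from the table above; everything else is an assumption for the sake of the example, including `BASE_URL`, the `build_request` helper, and the JSON body shape for `/api/save-answer`, none of which is documented here.

```python
import json
import urllib.request
from typing import Any, Optional

# Assumption: a local development server; the real host is not documented here.
BASE_URL = "http://localhost:8000"


def build_request(
    method: str,
    path: str,
    token: Optional[str] = None,
    payload: Optional[dict[str, Any]] = None,
) -> urllib.request.Request:
    """Build (but do not send) a request, attaching the Supabase JWT as a Bearer token."""
    headers = {"Accept": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    data = None
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(BASE_URL + path, data=data, headers=headers, method=method)


# Hypothetical body shape for saving one answer; the real schema may differ.
save_req = build_request(
    "POST",
    "/api/save-answer",
    token="test-token",
    payload={"field": "postal_code", "value": "M5V 2T6"},
)

# The health check is the only endpoint listed without auth.
health_req = build_request("GET", "/api/health")
```

Passing either request to `urllib.request.urlopen` would send it; the sketch stops short of that so it runs without a live server.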

Challenges We Ran Into

Parsing real-world form HTML was messier than expected: each form had a different structure, nested fieldsets, and inconsistent labeling, so we leaned heavily on Gemini's ability to interpret messy markup. Integrating ElevenLabs for both TTS and STT meant managing audio streaming and latency closely enough that the conversation still felt natural and responsive. Getting per-user memory isolation right with Supabase + Backboard took deliberate session management so that one user's data never leaked into another's conversation thread.


Accomplishments We're Proud Of

We built a fully functional accessibility tool that handles three real Canadian government and banking forms end-to-end with voice. The cross-form memory system means a user who fills out the TD Bank application never has to re-state their address for the TTC card. EchoAccess asks "I have your address on file as 123 Main St, shall I use that?" and moves on. The ElevenLabs integration gives the voice assistant a natural, human-like quality that makes the experience feel like talking to a real support agent, not a robotic screen reader. We're also proud of the per-user isolation architecture: every user gets their own Backboard assistant and thread, with Supabase row-level security ensuring zero data leakage.
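
The cross-form pre-fill described above can be sketched as a lookup against the stored profile followed by a yes/no confirmation. This is an illustrative sketch only: the function names, the flat `profile` dictionary, and the list of accepted confirmations are assumptions, and the real system retrieves the profile from Backboard and phrases the prompt with Gemini.

```python
from typing import Optional

# Assumption: a small set of spoken confirmations treated as "yes".
CONFIRMATIONS = {"yes", "yeah", "yep", "sure", "correct", "use that"}


def suggest_prefill(field_name: str, label: str, profile: dict[str, str]) -> Optional[str]:
    """Return a confirmation prompt if the profile already holds this field, else None."""
    value = profile.get(field_name)
    if not value:
        return None
    return f"I have your {label} on file as {value}, shall I use that?"


def resolve_answer(reply: str, stored: str) -> str:
    """Keep the stored value on a confirmation; otherwise treat the reply as a new answer."""
    normalized = reply.strip().lower().rstrip(".!")
    return stored if normalized in CONFIRMATIONS else reply.strip()
```

For example, with `{"address": "123 Main St"}` in the profile, `suggest_prefill("address", "address", profile)` yields the prompt quoted above, and a reply of "Yes." keeps the stored address while any other reply replaces it.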


What We Learned

True accessibility goes far beyond adding aria-label to buttons. It means rethinking the entire interaction model from the ground up. We learned how capable Gemini 2.5 Flash is at structured extraction from messy, inconsistent HTML, and how Backboard.io's persistent memory turns a stateless chatbot into something that genuinely feels like it knows you. Working with ElevenLabs taught us how much voice quality matters for accessibility; the difference between robotic-sounding TTS and ElevenLabs' natural voices is the difference between a user tolerating the tool and actually enjoying it.


What's Next for EchoAccess

We want to expand beyond three demo forms to support any arbitrary web form via URL input. Paste a link, and EchoAccess parses and voice-guides it automatically. We're also exploring multilingual support (French, Mandarin, Punjabi) for Canada's diverse population, and integration with government digital identity systems for secure pre-population. Long-term, we see EchoAccess as a browser extension that overlays on any form in real-time.


Built at Hack the Six 2026.
