AI-powered Bhagavad Gita Sanskrit pronunciation coach. Record your recitation → get word-level AI feedback → hear the correct pronunciation → perfect your Sanskrit.
VaakSiddhi is a fusion of three linguistic traditions:
| Part | Language | Meaning |
|---|---|---|
| वाक् (Vaak) | Sanskrit | Speech, Voice, The power of expression |
| सिद्धि (Siddhi) | Sanskrit/Hindi | Mastery, Perfection, Attainment |
| VaakSiddhi | Trilingual | "Mastery of Voice" |
It echoes naturally across Hindi, Sanskrit, and English speakers. In Hindu philosophy, Vaak-Siddhi is literally the spiritual power of perfect speech — exactly what this app helps you achieve.
- What It Does
- Why It Exists
- Architecture
- Tech Stack & Costs
- Project Structure
- Getting Started
- Deploy for Free
- Scaling Guide
- API Reference
- The AI Feedback Engine
- Pronunciation Audio — Upgrade Path
- Shloka Database
- Bugs Fixed
- Roadmap
- FAQ
VaakSiddhi is a mobile-first web app that turns your phone into a personal Sanskrit pronunciation guru.
You recite a shloka → AI listens → Hear your own voice back
→ AI tells you exactly what went wrong → Hear correct Devanagari TTS → You improve
| Feature | Description |
|---|---|
| 🎙 Voice Recording | Record yourself; hear your own voice back with a progress player |
| 🧠 AI Phonetic Analysis | Multi-provider cascade: Groq → Gemini → Heuristic fallback |
| 📊 Score + Grade | 0–100 score with letter grade (A+ to F) after each attempt |
| 📝 Word-Level Feedback | Exact words mispronounced with Devanagari badge + IAST reference |
| 🔊 Authentic TTS | Hear correct pronunciation in actual Devanagari script (hi-IN voice) |
| 💡 Actionable Tips | 3 specific improvement tips per session |
| 🔤 Phonetic Breakdown | Syllable-by-syllable guide for the hardest mispronounced word |
| 📚 Sanskrit Rules | One phonetics rule explained per session |
| 🇮🇳 Hindi Meanings | Full Devanagari Hindi translation of every shloka |
| 🌐 English Meanings | English translation for all verses |
| 📈 Progress Tracking | Session history, best scores, streaks, stats dashboard |
| 🧠 Spaced Repetition | SM-2 algorithm schedules shlokas for optimal review timing |
| ✨ Daily Sanskrit Word | A rotating Sanskrit word from the Gita every day |
| 📱 PWA / Offline | Installs on homescreen; shloka library works offline |
| 🔄 Scroll Memory | Library scroll position and filters persist across navigation |
Sanskrit has ~50 phonemes vs English's ~44. Many of its sounds have no direct English equivalent, and several are easy to blur even for Hindi speakers:
| Sound | Example | Correct articulation |
|---|---|---|
| ā (long a) | kāma, dhāraṇā | Like "father" — held twice as long as short 'a' |
| ṭ, ḍ, ṇ (retroflex) | kaṭhina, ḍamaru | Tongue curled back to the palate |
| ṣ (retroflex sibilant) | kṛṣṇa, viṣṇu | Harder than English "sh" — tongue further back |
| ḥ (visarga) | duḥkha, namaḥ | Soft echo-breath after the vowel |
| ṃ/ṁ (anusvara) | saṃskāra | Nasal hum — not a full "n" |
Getting these wrong doesn't just sound odd: in Sanskrit, vowel length and consonant type are phonemic, so a short vowel in place of a long one can produce an entirely different word.
Existing apps don't solve this:
- Vyoma/SanskritFromHome → Listen only, no feedback on your voice
- SGS Gita Tutor → Excellent audio library, zero recording feature
- Bhagavad Gita apps → Reference only, no pronunciation coaching
VaakSiddhi closes the feedback loop that's been missing from every Sanskrit learning tool.
See docs/architecture.svg for the full visual diagram.
┌──────────────────────────────────────────────────────────────────────────────┐
│ USER (Browser / Mobile — Chrome/Edge) │
└──────────────────────────────────┬───────────────────────────────────────────┘
│
┌────────────────────────┼──────────────────────────┐
▼ ▼ ▼
🎤 Web Speech API 🎙 MediaRecorder 🔊 Speech Synthesis
(hi-IN, 3 alternatives, (WebM/Opus, blob URL, (Devanagari TTS,
continuous, gotAnyAudio progress player, hi-IN, rate 0.45–0.6,
guard, manual fallback) replay fixed) authentic Sanskrit)
│ │ │
└────────────────────────┼──────────────────────────┘
│
⚡ Service Worker (PWA, cache-first, offline-safe)
│
┌──────────────────────────────────▼───────────────────────────────────────────┐
│ React 18 + Vite Frontend │
│ ErrorBoundary → HomeScreen · LibraryScreen · PracticeScreen · ResultsScreen │
│ Shared UI: ScoreRing · WaveformBars · Chip · Card · DifficultyBadge │
│ Lazy-loaded shlokas.json · sessionStorage scroll + filter memory │
└──────────────────────────────────┬───────────────────────────────────────────┘
│ POST /api/analyze + X-Request-ID header
┌──────────────────────────────────▼───────────────────────────────────────────┐
│ FastAPI Backend (Railway / Render) │
│ Rate limiter (20 RPM/IP) · Response cache (MD5, 10 min TTL) │
│ Pydantic v2 validation · CORS hardened · Structured logging + request IDs │
└──────┬──────────────────────┬────────────────────────────────┬───────────────┘
│ │ │
▼ ▼ ▼
① GROQ (Primary) ② GEMINI (Backup) ③ HEURISTIC (Fallback)
llama-3.3-70b gemini-2.5-flash Rule-based word match
14,400 req/day 15 RPM free Always works, no API needed
JSON mode responseMimeType:json Provider health tracker
↓ Response includes: Devanagari form per mistake word
┌─────────────────────────────┐ ┌────────────────────────────────────────┐
│ shlokas.json (Static) │ │ Progress Store │
│ 700 verses · Sanskrit │ │ Stage 1: localStorage │
│ Hindi · IAST · English │ │ Stage 2: Supabase PostgreSQL │
│ Lazy-loaded · CDN-cached │ │ SRS schedule · Streak · History │
└─────────────────────────────┘ └────────────────────────────────────────┘
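The backend box in the diagram mentions a response cache (MD5 key, 10-minute TTL) and a per-IP rate limiter (20 requests per minute). A minimal in-memory sketch of both is shown here, assuming a single backend process; the function names (`cache_key`, `cache_get`, `cache_put`, `allow_request`) are hypothetical and not taken from backend/main.py.

```python
import hashlib
import json
import time
from collections import defaultdict, deque

CACHE_TTL_SECONDS = 600      # 10-minute TTL, as stated in the diagram
RATE_LIMIT_PER_MINUTE = 20   # 20 requests per minute per IP

_cache = {}                  # key -> (expires_at, response)
_hits = defaultdict(deque)   # ip -> timestamps of recent requests


def cache_key(payload: dict) -> str:
    """MD5 of the canonical JSON form, so key order does not matter."""
    canonical = json.dumps(payload, sort_keys=True, ensure_ascii=False)
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()


def cache_get(key, now=None):
    """Return the cached response, or None if missing or expired."""
    now = time.time() if now is None else now
    entry = _cache.get(key)
    if entry is None or entry[0] <= now:
        _cache.pop(key, None)  # drop expired entries lazily
        return None
    return entry[1]


def cache_put(key, response, now=None):
    now = time.time() if now is None else now
    _cache[key] = (now + CACHE_TTL_SECONDS, response)


def allow_request(ip, now=None):
    """Sliding 60-second window: True if this IP is still under the limit."""
    now = time.time() if now is None else now
    window = _hits[ip]
    while window and window[0] <= now - 60:
        window.popleft()
    if len(window) >= RATE_LIMIT_PER_MINUTE:
        return False
    window.append(now)
    return True
```

Both structures live in process memory, so they reset on redeploy and are not shared across workers; a multi-instance deployment would need a shared store instead.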
1. Provider Abstraction Layer
All STT and LLM calls go through service wrappers — swap any provider in one line:
// src/services/llm.js: change backend URL to switch AI provider
const BACKEND_URL = import.meta.env.VITE_BACKEND_URL || "http://localhost:8000";
# backend/main.py: cascade tries providers in order
providers = [("groq", call_groq), ("gemini", call_gemini)]
# Comment out a provider or add a new one; the frontend never changes
2. Static Shloka Database
All shlokas = one JSON file. Zero DB queries for content. Lazy-loaded via import() so the app renders instantly without waiting for 16MB of data.
3. Graceful Degradation
- No mic permission → read shlokas, view meanings
- Offline → full shloka library (PWA cache), SRS queue still works
- AI API down → heuristic fallback still gives general coaching
- Network error during speech recognition → auto-switches to manual text input
4. Request ID Tracing
Every /api/analyze call gets a unique X-Request-ID header. The backend logs every step with rid= so any request can be traced end-to-end:
2026-03-28T14:22:01 INFO [vaaksiddhi] analyze ip=127.0.0.1 rid=vs-m3x8a-k2pq9 translit_len=142
2026-03-28T14:22:03 INFO [vaaksiddhi] cascade provider=groq score=74 rid=vs-m3x8a-k2pq9
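A rough sketch of how the backend side of this tracing could work with the standard `logging` module follows. The helper names (`configure_logging`, `request_id_from`, `log_line`) and the fallback ID format are assumptions for illustration; only the `X-Request-ID` header, the `vaaksiddhi` logger name, and the `rid=` field come from the description above.

```python
import logging
import uuid

logger = logging.getLogger("vaaksiddhi")


def configure_logging():
    """Emit lines shaped like '2026-03-28T14:22:01 INFO [vaaksiddhi] ...'."""
    handler = logging.StreamHandler()
    handler.setFormatter(logging.Formatter(
        "%(asctime)s %(levelname)s [%(name)s] %(message)s",
        datefmt="%Y-%m-%dT%H:%M:%S",
    ))
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)


def request_id_from(headers: dict) -> str:
    """Use the client's X-Request-ID if present, else mint one.

    The fallback format is hypothetical, not the one makeRequestId produces.
    """
    for name, value in headers.items():
        if name.lower() == "x-request-id" and value.strip():
            return value.strip()
    return f"vs-{uuid.uuid4().hex[:10]}"


def log_line(event: str, rid: str, **fields) -> str:
    """Build 'event key=value ... rid=...' so every line is greppable by rid."""
    parts = [event] + [f"{k}={v}" for k, v in fields.items()] + [f"rid={rid}"]
    return " ".join(parts)
```

A route handler would call `logger.info(log_line("cascade", rid, provider="groq", score=74))` at each step, which makes `grep rid=<id>` enough to reconstruct one request.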
| Layer | Technology | Cost |
|---|---|---|
| Frontend | React 18 + Vite | Free |
| Hosting | Vercel | Free |
| STT | Web Speech API (hi-IN) | Free |
| TTS | Browser Speech Synthesis + Devanagari | Free |
| AI Feedback | Groq free tier (14,400 req/day) | Free |
| Shloka DB | Static JSON (lazy-loaded) | Free |
| Offline | Service Worker PWA | Free |
| Progress | localStorage + SRS | Free |
| TOTAL | | ₹0 |
| Layer | Technology | Cost |
|---|---|---|
| Backend | FastAPI on Railway | $7/month |
| STT | OpenAI Whisper API | ~$18/month |
| AI | Groq paid / Gemini Pro | ~$15/month |
| DB | Supabase Pro | $25/month |
| TOTAL | | ~$65/month |
| Layer | Technology | Cost |
|---|---|---|
| STT | Self-hosted Whisper (RunPod A10G) | ~$90/month |
| AI | Claude Sonnet or GPT-4o | ~$120/month |
| Backend | Railway Pro or AWS ECS | $25/month |
| DB | Supabase Pro | $25/month |
| TOTAL | | ~$260/month |
vaaksiddhi/
│
├── src/ # React frontend
│ ├── App.jsx # Root routing (~80 lines, was 932)
│ ├── main.jsx # Entry point + ErrorBoundary wrapper
│ │
│ ├── components/
│ │ ├── ErrorBoundary.jsx # Catches all React crashes, shows reload
│ │ ├── HomeScreen.jsx # Dashboard: stats, daily word, history
│ │ ├── LibraryScreen.jsx # Shloka browser: lazy JSON, scroll memory
│ │ ├── PracticeScreen.jsx # Recording + STT + MediaRecorder
│ │ ├── ResultsScreen.jsx # Analysis display + TTS + recording playback
│ │ └── ui/
│ │ └── index.jsx # Shared: Chip, Card, ScoreRing, WaveformBars…
│ │
│ ├── data/
│ │ └── shlokas.json # 700 Bhagavad Gita verses (lazy-loaded)
│ │
│ ├── services/
│ │ ├── audio.js # Web Speech API + MediaRecorder abstraction
│ │ ├── llm.js # Backend proxy + makeRequestId + local fallback
│ │ └── storage.js # localStorage + SRS (SM-2) + streak + settings
│ │
│ └── styles/
│ └── tokens.js # Design tokens (colors) + global CSS
│
├── backend/ # FastAPI
│ ├── main.py # Routes · cascade · rate limit · cache · logging
│ ├── requirements.txt # Python deps (pydantic>=2.9, fastapi>=0.115)
│ └── .env # NOT committed — copy from .env.example
│
├── public/
│ ├── sw.js # Service Worker: cache-first, API network-only
│ ├── manifest.json # PWA manifest: standalone, theme #0D0818
│ └── om.svg # App icon
│
├── docs/
│ └── architecture.svg # Full system architecture diagram
│
├── index.html # PWA meta tags + service worker registration
├── vite.config.js # Vite build config
├── .env.example # Template — copy to backend/.env
└── README.md # This file
- Node.js 18+ — nodejs.org
- Python 3.9+ — python.org
- A free Groq API key — console.groq.com (30 seconds, no credit card)
- Chrome or Edge — required for Web Speech API (Safari/Firefox not supported)
# 1. Clone
git clone https://github.com/yourusername/vaaksiddhi
cd vaaksiddhi
# 2. Install frontend deps
npm install
# 3. Start frontend (no backend needed for basic use)
npm run dev
# → http://localhost:5173
cd backend
# Create virtual environment
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Create .env file
cp ../.env.example .env
# Edit .env and add:
# GROQ_API_KEY=gsk_... ← from console.groq.com (free)
# GEMINI_API_KEY=AIza... ← from aistudio.google.com (optional backup)
# FRONTEND_URL=http://localhost:5173
# Start backend
uvicorn main:app --reload --port 8000
# → API: http://localhost:8000
# → Docs: http://localhost:8000/docs
The frontend automatically points to http://localhost:8000 in development.
npm i -g vercel
vercel
# In Vercel dashboard → Settings → Environment Variables:
# VITE_BACKEND_URL = https://your-backend.railway.app
- Push to GitHub
- railway.app → New Project → Deploy from GitHub → select backend/
- Add environment variables: GROQ_API_KEY, FRONTEND_URL
- Railway auto-deploys from the main branch
Cost: Free hobby tier (500 hrs/month). $5/month for always-on.
| Trigger | Upgrade | Cost delta |
|---|---|---|
| Users complain about transcription | Web Speech → Whisper API | +$18/mo per 10K users |
| Want cross-device sync | localStorage → Supabase | +$25/mo (free up to 50K rows) |
| Need better AI quality | Groq → Claude Sonnet | +~$20/mo at moderate volume |
| STT costs >$100/mo | Whisper API → self-hosted GPU | Flat ~$90/mo, any volume |
| 1M+ users | Multi-region AWS | Custom — talk to us |
Main pronunciation analysis endpoint.
Request:
{
"expected_transliteration": "karmaṇy-evādhikāras te mā phaleṣu kadācana",
"spoken_transcript": "karmanye vadhikaraste ma phaleshu kadachana",
"spoken_alternatives": ["karma vadhikar", "karmanye vadikaraste"],
"shloka_english": "You have a right to perform your prescribed duties...",
"hard_sounds": ["karmaṇy", "phaleṣu"]
}
Response:
{
"score": 78,
"grade": "B+",
"overall": "Strong attempt — your rhythm is excellent, focus on vowel lengths.",
"praise": "Perfect pause placement between pādas.",
"mistakes": [
{
"word": "karmaṇy",
"devanagari": "कर्मण्य",
"issue": "The 'ṇ' is retroflex — tongue must curl back to the palate"
}
],
"tips": ["For ā: say 'ah' as in father, hold twice as long as short a"],
"phonetic_guide": {
"word": "karmaṇy",
"devanagari": "कर्मण्य",
"breakdown": "kar-mun-yuh — 'u' is very short, 'yuh' ends softly",
"example": "Similar to 'car' + 'mun' + quick 'yuh'"
},
"sanskrit_rule": "Sanskrit has 5 nasal sounds (ṅ ñ ṇ n m) — each is phonemically distinct.",
"encouragement": "अभ्यासेन तु कौन्तेय — Through practice, all is achieved. (BG 6.35)",
"provider": "groq",
"cached": false
}
Key change from v1: mistakes are now objects { word, devanagari, issue } instead of strings, enabling authentic Devanagari TTS playback.
Returns today's Sanskrit word from a rotating curated list of 15 Gita terms.
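One simple way to rotate through a fixed list by date is to index it with the day number modulo the list length. The sketch below assumes that approach; `daily_word_index` is a hypothetical name and the backend may select the word differently.

```python
from datetime import date


def daily_word_index(today: date, list_size: int = 15) -> int:
    """Map a calendar date to a stable index into the curated word list."""
    # toordinal() increases by exactly 1 per day, so the index advances
    # one step daily and repeats every `list_size` days.
    return today.toordinal() % list_size
```

Every user sees the same word on the same day, and no state needs to be stored.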
Provider status, cache count, and timestamps.
API info, configured providers, and signup links.
Request → Backend → [Rate limit check] → [Cache lookup]
│ cache miss
▼
[Groq healthy?] ─yes→ call_groq()
│ no / fails
▼
[Gemini healthy?] ─yes→ call_gemini()
│ no / fails
▼
heuristic_analysis() ← always works
Provider health is tracked over a 5-minute window. If a provider fails 3 times, it's automatically skipped for the rest of the window.
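The cascade and health check described above can be sketched as follows. This is an approximation under stated assumptions: `ProviderHealth` and `run_cascade` are hypothetical names, and the sketch uses a sliding 5-minute window (a provider is skipped while it has 3 failures newer than 5 minutes), which may differ from how backend/main.py defines its window.

```python
import time

FAILURE_LIMIT = 3      # failures before a provider is skipped
WINDOW_SECONDS = 300   # 5-minute health window


class ProviderHealth:
    """Tracks recent failures per provider name."""

    def __init__(self):
        self._failures = {}  # provider name -> list of failure timestamps

    def record_failure(self, name, now=None):
        now = time.time() if now is None else now
        self._failures.setdefault(name, []).append(now)

    def is_healthy(self, name, now=None):
        now = time.time() if now is None else now
        recent = [t for t in self._failures.get(name, [])
                  if now - t < WINDOW_SECONDS]
        self._failures[name] = recent  # forget failures outside the window
        return len(recent) < FAILURE_LIMIT


def run_cascade(providers, fallback, request, health, now=None):
    """Try each healthy provider in order; use the fallback if all fail."""
    for name, call in providers:
        if not health.is_healthy(name, now):
            continue
        try:
            result = call(request)
        except Exception:
            health.record_failure(name, now)
            continue
        return {**result, "provider": name}
    return {**fallback(request), "provider": "heuristic"}
```

With `providers = [("groq", call_groq), ("gemini", call_gemini)]` and `heuristic_analysis` as the fallback, a caller always receives a result, and the `provider` field records which tier produced it.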
The AI persona is Guru Vaak — a warm, technically precise Sanskrit phonetics expert who:
- Celebrates what the student did well (confidence-building)
- Returns the Devanagari form of every mispronounced word alongside IAST
- Explains the Sanskrit phonetics rule behind each correction
- Accounts for Web Speech API's unreliability with Sanskrit by using all 3 transcript alternatives together
- Returns valid JSON with response_format: json_object (Groq) / responseMimeType: application/json (Gemini)
Chrome's Web Speech API returns up to 3 different transcription guesses per segment. VaakSiddhi collects all alternatives across all result segments and sends them to the LLM:
Primary: "dharma kshetre kuru kshetre"
Alt 1: "dharm kshetra kuru kshetra"
Alt 2: "dharma chitra kuru kshetra"
→ LLM uses all three together to infer actual pronunciation
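As an illustration of how the alternatives could be folded into the LLM prompt, the sketch below deduplicates the transcripts and labels them in the Primary / Alt format shown above. The function names and the cap of three entries are assumptions, not the actual prompt-building code.

```python
def collect_transcripts(primary, alternatives, limit=3):
    """Deduplicate case-insensitively, keep order, primary first."""
    seen, ordered = set(), []
    for text in [primary, *alternatives]:
        normalized = " ".join(text.lower().split())  # collapse whitespace
        if normalized and normalized not in seen:
            seen.add(normalized)
            ordered.append(normalized)
    return ordered[:limit]


def transcript_section(primary, alternatives):
    """Render the labelled block that is embedded in the prompt."""
    lines = []
    for i, text in enumerate(collect_transcripts(primary, alternatives)):
        label = "Primary" if i == 0 else f"Alt {i}"
        lines.append(f'{label}: "{text}"')
    return "\n".join(lines)
```

Dropping duplicates keeps the prompt short when the recognizer returns the same guess more than once, while the remaining disagreements are exactly the spots where pronunciation was ambiguous.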
The current TTS uses the browser's built-in SpeechSynthesisUtterance with hi-IN voice and Devanagari text. This is decent but uses a Hindi voice model — not a Sanskrit one. For authentic pronunciation, here are the upgrade options in order of effort:
Bhashini is India's government-funded API built specifically for Indian languages, including Sanskrit. It has voices recorded by native Sanskrit speakers.
# backend/main.py: add this route
import os
import httpx
# Assumed env var name; keep the key server-side only
BHASHINI_KEY = os.environ.get("BHASHINI_KEY", "")
@app.get("/api/tts")
async def tts(text: str, lang: str = "sa"):
    # Forward the text to Bhashini's TTS pipeline and return base64 audio
    async with httpx.AsyncClient() as client:
        resp = await client.post(
            "https://dhruva-api.bhashini.gov.in/services/inference/pipeline",
            headers={"Authorization": BHASHINI_KEY},
            json={"pipelineTasks": [{"taskType": "tts", "config": {"language": {"sourceLanguage": lang}}}],
                  "inputData": {"input": [{"source": text}]}},
        )
    resp.raise_for_status()  # surface upstream errors instead of a KeyError below
    return {"audio_base64": resp.json()["pipelineResponse"][0]["audio"][0]["audioContent"]}
Sign up at bhashini.gov.in → API key is free for Indian developers.
Google has an actual Sanskrit voice model (sa-IN-Standard-A). The WaveNet version sounds distinctly different from Hindi.
from google.cloud import texttospeech
client = texttospeech.TextToSpeechClient()
synthesis_input = texttospeech.SynthesisInput(text=devanagari_text)
voice = texttospeech.VoiceSelectionParams(language_code="sa-IN", name="sa-IN-Standard-A")
First 1M characters/month are free; ₹0 for most small apps.
Sarvam AI is purpose-built for Indian languages and has dedicated Sanskrit support with nuanced prosody. Sign up at sarvam.ai.
Record a Sanskrit teacher or use freely licensed recordings from spokensanskrit.org or the AI4Bharat corpus. Map recordings to shloka IDs in shlokas.json. Zero runtime cost, perfect pronunciation.
| Text given to TTS | Voice | Sounds like |
|---|---|---|
| IAST Latin: dhṛtarāṣṭra | hi-IN | Garbled English ❌ |
| Devanagari: धृतराष्ट्र | hi-IN | Accented Hindi, recognisable ✅ |
| Devanagari: धृतराष्ट्र | sa-IN (Google) | Authentic Sanskrit ✅✅ |
| Pre-recorded audio | Pandit | Perfect ✅✅✅ |
src/data/shlokas.json is lazily imported to avoid blocking first render. Schema:
{
"id": "BG2.47",
"chapter": 2,
"verse": 47,
"chapter_name": "Sankhya Yoga",
"title": "Chapter 2, Verse 47",
"sanskrit": "कर्मण्येवाधिकारस्ते...",
"transliteration": "karmaṇy-evādhikāras te...",
"hindi": "तुम्हारा अधिकार केवल कर्म करने में है...",
"english": "You have a right to perform your duties...",
"difficulty": "beginner",
"keywords": ["karma", "adhikara", "phala"],
"hard_sounds": ["karmaṇy", "phaleṣu", "hetur"]
}
A complete list of every bug discovered and resolved during development:
| # | Bug | Root cause | Fix |
|---|---|---|---|
| 1 | Pydantic build fails on Python 3.13 | pydantic-core 2.14.1 called ForwardRef._evaluate() with the wrong number of arguments; Python 3.13 changed the API | Updated requirements.txt from pinned == versions to >= ranges (pydantic>=2.9.0, fastapi>=0.115.0) |
| 2 | App.jsx was 932 lines | Single monolithic file with all screens, logic, and styles | Split into 7 focused files: HomeScreen, LibraryScreen, PracticeScreen, ResultsScreen, ErrorBoundary, ui/index, tokens.js |
| 3 | API keys exposed in browser | Claude was called directly from the frontend with VITE_ANTHROPIC_API_KEY visible to anyone opening DevTools | All AI calls now route through the FastAPI backend; keys are server-only env vars |
| 4 | CORS wildcard in production | allow_origins=["*"] accepted requests from any origin | CORS now reads FRONTEND_URL env var; falls back to localhost only |
| 5 | 16MB JSON blocking first render | shlokas.json was statically imported at module load | Replaced with dynamic import("../data/shlokas.json") inside useEffect; renders instantly with spinner |
| 6 | Dual recognizer abort bug | Adding a second en-IN SpeechRecognition instance alongside hi-IN caused Chrome to abort both immediately with error: aborted | Reverted to single hi-IN recognizer; collect all 3 built-in alternatives via event.results[i][j] loop |
| 7 | "No speech detected" false positive | recognition.onend fired immediately after start() (browser quirk on some inputs) and triggered onResults with an empty transcript before the user spoke | Added gotAnyAudio boolean guard: onEnd only calls onResults if at least one final result segment was received |
| 8 | App freezes when offline | Web Speech API requires internet (streams to Google). error: network fired but onEnd never followed → PracticeScreen stuck in recording state | Caught error: network in onerror, set shouldRestart = false, auto-switched to manual text input with helpful message |
| 9 | No crash recovery | Any unhandled React render error left a blank white screen with no way for the user to recover | Added ErrorBoundary.jsx class component wrapping the entire app tree; shows "Reload App" button on crash |
| 10 | Recording playback URL lost | playbackUrl stored in React state; stopAndAnalyze navigated to ResultsScreen before state updated → Results never received the URL | Used playbackUrlRef (useRef, synchronous) to capture the blob URL immediately; forwarded through onResults payload |
| 11 | WebM blob duration = Infinity | Chrome's MediaRecorder doesn't write a duration header into WebM files → audio.duration === Infinity → progress bar broken | On loadedmetadata, seek to currentTime = 1e9 to force the browser to scan to the real end and set correct duration |
| 12 | Replay after stop didn't work | After onEnded, audio.currentTime === audio.duration. Calling play() again fired ended instantly. currentTime was never reset | toggle() now always sets currentTime = 0 before play(); awaits the Promise and catches errors |
| 13 | TTS sounded like English | SpeechSynthesisUtterance was fed IAST Latin text (dhṛtarāṣṭra), which the browser reads as garbled English | Changed all TTS calls to use Devanagari script (shloka.sanskrit); LLM prompt updated to return devanagari field for each mistake word |
| 14 | makeRequestId declared, never used | IDE warning; request IDs existed client-side but were not sent to the backend | Wired into fetch headers as X-Request-ID; backend updated to extract and log with rid= in every structured log line |
| 15 | No scroll/filter memory in Library | Every time you navigated back from Practice, the library reset to top with no filters | Persisted scroll position and active filter to sessionStorage; restored on mount |
| 16 | No spaced repetition | Users had to manually decide which shloka to review | Implemented SM-2 simplified algorithm in storage.js: score≥80 doubles interval (max 30 days), score≥60 keeps 3-day minimum, score<60 resets to 1 day |
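The scheduling rule in row 16 can be written out as a small function. The real implementation lives in storage.js (JavaScript); this Python version is only an illustration of one reading of the rule, in which a score of 60–79 leaves the interval unchanged but never below 3 days. The names `next_interval_days` and `next_review_date` are hypothetical.

```python
from datetime import date, timedelta

MAX_INTERVAL_DAYS = 30
MIN_OK_INTERVAL_DAYS = 3


def next_interval_days(previous_days: int, score: int) -> int:
    """Simplified SM-2 style update driven by the 0-100 session score."""
    if score >= 80:
        # Strong recall: double the interval, capped at 30 days.
        return min(max(previous_days, 1) * 2, MAX_INTERVAL_DAYS)
    if score >= 60:
        # Acceptable recall: keep the interval, but at least 3 days.
        return max(previous_days, MIN_OK_INTERVAL_DAYS)
    # Weak recall: review again tomorrow.
    return 1


def next_review_date(today: date, previous_days: int, score: int) -> date:
    """Date on which the shloka becomes due again."""
    return today + timedelta(days=next_interval_days(previous_days, score))
```

Under this reading, a shloka scored 85 after a 4-day gap is next due in 8 days, while a score of 40 brings it back the following day regardless of history.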
- React 18 + Vite frontend, split into focused components
- FastAPI backend with Groq → Gemini → Heuristic cascade
- Voice recording with waveform visualization + blob URL playback
- Hear your own recording back with progress bar
- Devanagari TTS for correct pronunciation + per-mistake word TTS
- Score + grade system (0–100, A+–F)
- Word-level mistakes with Devanagari badge
- Phonetic breakdown for hardest word
- 3 improvement tips per session
- Progress tracking with SM-2 spaced repetition
- PWA: installable, offline-capable, service worker
- Daily Sanskrit word (15 rotating Gita terms)
- Scroll memory + filter persistence in Library
- ARIA labels, keyboard accessible
- ErrorBoundary for crash recovery
- Request ID tracing end-to-end
- Structured Python logging with rid= per request
- All 700 shlokas across 18 chapters
- Chapter summaries and context
- Favourite shlokas
- Share results card (Instagram-friendly)
- Bhashini API TTS backend route (GET /api/tts)
- Google Cloud TTS sa-IN voice integration
- Side-by-side waveform: your voice vs correct pronunciation
- 0.5x slowdown mode for difficult passages
- Supabase cross-device sync
- Google/Apple login (Supabase Auth)
- Whisper API for better STT accuracy
- Personalized weak-spot detection
- 30-day Sanskrit learning curriculum
- Push notification reminders (PWA)
- React Native iOS/Android apps
- Yoga Sutras, Upanishads, Hanuman Chalisa
- Sanskrit keyboard input
- Teacher mode: assign shlokas, view student scores
- Premium tier (₹99/month)
| System | WER | Cost | Notes |
|---|---|---|---|
| Web Speech API hi-IN | ~35% | Free | Good enough for MVP with alternatives |
| OpenAI Whisper (base) | ~25% | $0.006/min | Better, but still not Sanskrit-specific |
| AI4Bharat Whisper Sanskrit | ~15.4% | Self-host | Best accuracy, needs GPU |
| Bhashini ASR | ~18% | Free | Indian govt, Sanskrit trained |
Even at a WER of 15–35%, the transcript usually preserves enough detail for the AI to catch the errors that most affect recitation quality: wrong consonant type and wrong vowel length.
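For readers unfamiliar with the metric, word error rate is the word-level edit distance (substitutions + deletions + insertions) between a reference and a transcript, divided by the number of reference words. The sketch below is a generic implementation for experimenting with your own transcripts; it is not part of VaakSiddhi and was not used to produce the figures in the table.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level Levenshtein distance / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] = edit distance between the ref words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words so far
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of an extra word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)
```

Note that WER counts a word as wrong even when a single vowel differs, so a high WER on Sanskrit can still leave most of each word intact for the LLM to reason about.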
| Product | Recording? | AI feedback? | Sanskrit specific? | Gita specific? |
|---|---|---|---|---|
| Vyoma SanskritFromHome | ❌ | ❌ | ✅ | ✅ |
| SGS Gita Tutor | ❌ | ❌ | ✅ | ✅ |
| Vidya.AI (beta) | ✅ | Partial | ✅ | ❌ |
| Duolingo (Sanskrit) | ❌ | ❌ | ✅ | ❌ |
| VaakSiddhi | ✅ | ✅ | ✅ | ✅ |
Priority areas:
- Shloka data — Help verify and add all 700 verses
- Sanskrit phonetics accuracy — Review AI feedback for linguistic correctness
- Bhashini/Google TTS integration — Add authentic Sanskrit audio
- Hindi/regional translations — Tamil, Telugu, Marathi, Kannada
- Whisper integration — Better STT than Web Speech API
git checkout -b feature/your-feature
git commit -m "feat: your description"
git push origin feature/your-feature
# → Open Pull Request
Code conventions:
- Functional React components + hooks only
- All AI calls must have a fallback
- Never commit .env or API keys
- Validate at system boundaries (user input, API responses); trust internal code
Q: Why is it not called GitaGuru?
A: VaakSiddhi works equally well across Hindi/Sanskrit/English speakers, reflects the spiritual depth of the project, and has no trademark conflicts.
Q: Does it work on iPhone?
A: Safari on iOS doesn't support Web Speech API. Use Chrome on Android, or Chrome/Edge on desktop. Native iOS app is on the v3.0 roadmap.
Q: Is my voice stored anywhere?
A: In Stage 1, audio is processed entirely in the browser; only the text transcript is sent to the AI. The blob URL for playback lives in memory only and is cleared when you leave Results. Nothing is uploaded.
Q: The TTS doesn't sound very Sanskrit-like.
A: Correct — the browser's hi-IN voice is a Hindi model reading Devanagari. It's better than IAST (which sounds like English), but not perfect. The Pronunciation Audio Upgrade Path section above explains how to add Bhashini (free) or Google sa-IN for authentic Sanskrit voice.
Q: How accurate is the AI feedback?
A: It is most reliable for common errors (vowel length, retroflex consonants, visarga). Because feedback is inferred from the speech-to-text transcript rather than the audio itself, always cross-reference with a human guru for subtle sandhi rules or Vedic pitch accents.
Q: Can I use this for other Sanskrit texts?
A: The AI system works for any Sanskrit text. Swap the shloka database. Yoga Sutras, Upanishads, and Hanuman Chalisa are on the v3.0 roadmap.
MIT — free to use, fork, modify, and distribute.
- Bhagavad Gita translations: Swami Prabhupada (ISKCON), Swami Sivananda (DLS), Winthrop Sargeant
- Sanskrit ASR research: AI4Bharat team, IIT Madras Speech Lab
- Bhashini project: MeitY India — making Indian language AI publicly accessible
- Design inspiration: Indian manuscript illumination tradition
ॐ तत् सत् "That is the Truth"
योगः कर्मसु कौशलम् — Yoga is skill in action. (Bhagavad Gita 2.50)