CodeBySonu95/VoxSherpa-TTS

VoxSherpa TTS

Studio-quality offline neural text-to-speech for Android.
Hindi · English · British English · Japanese · Chinese · and more. No cloud. No limits. No compromise.


🏆 Featured In

VoxSherpa TTS is listed in the official README of k2-fsa/sherpa-onnx, the core inference library powering this app.

Sherpa-ONNX HuggingFace


Why VoxSherpa?

Most TTS apps make you choose between quality and privacy. Cloud-based tools like ElevenLabs sound incredible, but they require an internet connection, send your text to remote servers, and charge per character.

VoxSherpa breaks that tradeoff.

It runs two professional-grade neural engines entirely on your device:

| Engine | Quality | Speed | Best For |
|---|---|---|---|
| 🧠 Kokoro-82M | Studio-grade · rivals ElevenLabs | Slower on budget hardware | Audiobooks, voiceovers, professional content |
| ⚡ Piper / VITS | Natural · clear | Fast on any device | Daily use, quick synthesis |

Screenshots

[Screenshots of the Generate, Models, Library, and Settings screens]

Features

🎙️ Dual Neural Engine

  • Kokoro-82M: an 82-million-parameter neural model. Multilingual support including Hindi, English, British English, French, Spanish, Chinese, Japanese, and 50+ more languages. Same architecture used by top-tier commercial TTS services.
  • Piper / VITS: fast, lightweight, natural. Generates speech in seconds on any Android device.

🔒 100% Offline & Private

  • All processing happens on your device
  • No internet required after model download
  • No account, no telemetry, no data collection
  • Your text never leaves your phone

📦 Model Management

  • Download models directly from the app
  • Import your own .onnx models from local storage
  • Multiple models installed simultaneously
  • Smart storage tracking

🎧 Audio Controls

  • Real-time waveform visualization
  • Adjustable speed and pitch
  • Play, pause, and replay generated audio
  • Export as WAV with correct sample rate per model
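
Exporting with the correct sample rate matters because the rate is stored in the WAV header: if a clip generated at 24 kHz is labeled with a different rate, it plays back at the wrong speed and pitch. The sketch below shows one way to wrap 16-bit mono PCM in a standard 44-byte RIFF/WAVE header. `WavWriter` and `toWav` are hypothetical names for illustration and are not taken from the VoxSherpa source.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for illustration; not part of the VoxSherpa code base. */
final class WavWriter {

    /** Wraps 16-bit mono PCM samples in a canonical 44-byte RIFF/WAVE header. */
    static byte[] toWav(short[] pcm, int sampleRate) {
        int channels = 1;
        int bitsPerSample = 16;
        int blockAlign = channels * bitsPerSample / 8; // bytes per sample frame
        int byteRate = sampleRate * blockAlign;        // bytes per second of audio
        int dataSize = pcm.length * blockAlign;

        // WAV is little-endian throughout.
        ByteBuffer buf = ByteBuffer.allocate(44 + dataSize).order(ByteOrder.LITTLE_ENDIAN);
        buf.put("RIFF".getBytes(StandardCharsets.US_ASCII));
        buf.putInt(36 + dataSize);                     // file size minus the first 8 bytes
        buf.put("WAVE".getBytes(StandardCharsets.US_ASCII));
        buf.put("fmt ".getBytes(StandardCharsets.US_ASCII));
        buf.putInt(16);                                // fmt chunk size for plain PCM
        buf.putShort((short) 1);                       // audio format 1 = PCM
        buf.putShort((short) channels);
        buf.putInt(sampleRate);                        // must match the model's output rate
        buf.putInt(byteRate);
        buf.putShort((short) blockAlign);
        buf.putShort((short) bitsPerSample);
        buf.put("data".getBytes(StandardCharsets.US_ASCII));
        buf.putInt(dataSize);
        for (short s : pcm) {
            buf.putShort(s);
        }
        return buf.array();
    }
}
```

The caller would pass whatever rate the active engine reports for its output rather than a fixed constant, so each model's recordings keep their native rate.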

📚 Speech Library

  • Save all generated audio locally
  • Favorites system for quick access
  • View generation history with timestamps
  • Voice model attribution per recording

⚙️ Smart Settings

  • Smart Punctuation: natural pauses after sentence breaks
  • Emotion Tags: [whisper], [angry], [happy] support
  • Per-model voice selection (Kokoro supports 100+ speakers)
  • Theme-aware UI
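
To make the Emotion Tags idea concrete, here is a minimal sketch of how inline tags could be split out of the input before synthesis. It assumes a tag stays in effect until the next tag and that untagged text is "neutral". Both are assumptions, as are the names `EmotionTagParser` and `Segment`; the app's actual handling may differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical parser for illustration; not taken from the VoxSherpa source. */
final class EmotionTagParser {

    /** One run of text plus the emotion in effect for it. */
    static final class Segment {
        final String emotion;
        final String text;

        Segment(String emotion, String text) {
            this.emotion = emotion;
            this.text = text;
        }
    }

    // Only the three tags named in the feature list are recognized.
    private static final Pattern TAG = Pattern.compile("\\[(whisper|angry|happy)\\]");

    /** Splits input at each recognized tag; unknown brackets stay in the text. */
    static List<Segment> parse(String input) {
        List<Segment> out = new ArrayList<>();
        Matcher m = TAG.matcher(input);
        String current = "neutral"; // assumed default when no tag has been seen
        int last = 0;
        while (m.find()) {
            addIfNotBlank(out, current, input.substring(last, m.start()));
            current = m.group(1);
            last = m.end();
        }
        addIfNotBlank(out, current, input.substring(last));
        return out;
    }

    private static void addIfNotBlank(List<Segment> out, String emotion, String text) {
        String trimmed = text.trim();
        if (!trimmed.isEmpty()) {
            out.add(new Segment(emotion, trimmed));
        }
    }
}
```

Each resulting segment could then be synthesized with settings chosen for its emotion, for example a lower volume for whisper. How an emotion maps to engine parameters is left open here.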

Technical Architecture

User Text
    │
    ├─── Kokoro Engine (KokoroEngine.java)
    │         └── Sherpa-ONNX JNI → ONNX Runtime → CPU/NNAPI
    │                   └── kokoro-multi-lang-v1_0 (82M params, FP32)
    │
    └─── Piper / VITS Engine (VoiceEngine.java)
              └── Sherpa-ONNX JNI → ONNX Runtime → CPU
                        └── VITS model (language-specific)

Built with:

  • Sherpa-ONNX: on-device neural inference
  • Kokoro-82M: multilingual neural TTS model
  • Piper: fast local TTS
  • Android AudioTrack API: low-latency PCM playback
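
The last step of the pipeline, handing generated samples to the audio output, can be sketched as follows. This assumes the engine returns normalized float samples in the range [-1, 1] and that playback uses 16-bit PCM; the app may equally pass floats straight through. `PcmConverter` is a hypothetical name, not a class from this repository.

```java
/** Hypothetical helper for illustration; not taken from the VoxSherpa source. */
final class PcmConverter {

    /** Converts normalized float samples to signed 16-bit PCM, clamping overshoot. */
    static short[] toPcm16(float[] samples) {
        short[] out = new short[samples.length];
        for (int i = 0; i < samples.length; i++) {
            // Clamp first so values slightly outside [-1, 1] do not wrap around.
            float clamped = Math.max(-1f, Math.min(1f, samples[i]));
            out[i] = (short) Math.round(clamped * 32767f);
        }
        return out;
    }
}
```

On Android, the resulting array could be written to an `AudioTrack` configured for `ENCODING_PCM_16BIT` with `AudioTrack.write(short[], int, int)`, using the sample rate of the active model.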

Performance

Generation speed depends entirely on your device's processor:

| Device Tier | Kokoro | Piper |
|---|---|---|
| 🟢 Flagship (Snapdragon 8 Gen 3) | ~20–40 sec/min audio | ~5 sec/min audio |
| 🟡 Mid-range (8-core) | ~60–90 sec/min audio | ~10 sec/min audio |
| 🔴 Budget (6-core) | ~2–3 min/min audio | ~20 sec/min audio |

Kokoro prioritizes quality over speed by design. It uses the same 82M parameter architecture that powers premium commercial TTS, and running it entirely offline on a mobile CPU pushes the hardware to its limits.
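
The figures above are seconds of generation per minute of output audio, so dividing by 60 gives a real-time factor (values above 1 mean generation is slower than playback). The sketch below turns a table entry into a rough wait estimate. `GenerationEstimate` is a hypothetical helper, and the result is only as accurate as the approximate numbers in the table.

```java
/** Hypothetical helper for illustration; not taken from the VoxSherpa source. */
final class GenerationEstimate {

    /** Real-time factor: seconds of compute per second of generated audio. */
    static double realTimeFactor(double generationSecondsPerAudioMinute) {
        return generationSecondsPerAudioMinute / 60.0;
    }

    /** Estimated wall-clock seconds needed to synthesize the given audio length. */
    static double estimateSeconds(double audioSeconds, double generationSecondsPerAudioMinute) {
        return audioSeconds * realTimeFactor(generationSecondsPerAudioMinute);
    }
}
```

For example, a 10-minute chapter (600 seconds of audio) on a mid-range device at the upper Kokoro figure of 90 sec/min works out to 600 × 1.5 = 900 seconds, or about 15 minutes of generation.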


Installation

🚀 Early Access (Production Review Pending)

Update: Thanks to the amazing support from this community, the 14-day closed testing is complete, and VoxSherpa TTS is currently under Production Review by Google Play! 🎉

While we wait for the app to go publicly live, you can still get Early Access to the stable V2.5 directly from the Play Store.

What's new in V2.5 (Stable):

  • 🔊 System-wide TTS engine: use VoxSherpa in any app (Chrome, WhatsApp, etc.)
  • 📄 PDF to Audio
  • 📑 TXT to Audio
  • ✨ Interactive mini-player, smoother UI, and improved audio generation

How to join Early Access:

  1. Fill out the form below with your Google Play email.
  2. I will manually add you to the early access list.
  3. You will receive a direct Play Store link to install the app.

Join Early Access

Source code for V2.5 will be pushed to the GitHub Main branch once the production version is officially live on the Play Store.

Model Import (Technical Users)

VoxSherpa supports importing custom .onnx models without any server:

  1. Place your .onnx model + tokens.txt on device storage
  2. Open Models tab → tap + → Import Local Model
  3. Select your files

Works with any TTS model that Sherpa-ONNX supports.
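
Before loading an imported model, it helps to check that the selection contains the two files named in the steps above. A minimal sketch of such a check, assuming the file picker yields plain file names; `ImportCheck` and its messages are hypothetical and not taken from the app:

```java
import java.util.List;
import java.util.Locale;
import java.util.Optional;

/** Hypothetical pre-import check for illustration; not taken from the VoxSherpa source. */
final class ImportCheck {

    /** Returns an empty Optional when the selection looks importable, else a reason. */
    static Optional<String> validate(List<String> fileNames) {
        long onnxCount = fileNames.stream()
                .filter(name -> name.toLowerCase(Locale.ROOT).endsWith(".onnx"))
                .count();
        boolean hasTokens = fileNames.stream().anyMatch("tokens.txt"::equals);

        if (onnxCount == 0) {
            return Optional.of("No .onnx model selected");
        }
        if (onnxCount > 1) {
            return Optional.of("Select exactly one .onnx model");
        }
        if (!hasTokens) {
            return Optional.of("tokens.txt is missing");
        }
        return Optional.empty();
    }
}
```

A check like this only confirms that the expected files are present. Whether the model actually loads still depends on Sherpa-ONNX accepting it.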


Contributing

VoxSherpa is open source. Contributions welcome:

  • 🐛 Bug reports via Issues
  • 💡 Feature requests via Discussions
  • 🔧 Pull requests for fixes and improvements

License

Copyright (C) 2025 CodeBySonu95

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

https://www.gnu.org/licenses/gpl-3.0.html

Acknowledgements


Built with obsession. Runs without internet.

VoxSherpa: because your voice deserves to stay yours.
