Voice input for AI workflows on Linux.
You're deep in a session with a coding agent. You know exactly what you want to ask — a complex refactor, a debugging question, a feature request. But now you have to type it all out.
By the time you're done, you've lost the thread.
Context switching kills flow. And typing at 40 WPM when you can speak at 150 WPM is a bottleneck you don't need.
Press a key. Speak. Press again. Paste.
hyprvox is a voice-to-text daemon for Linux. It runs in the background, transcribes when you need it, and puts the result on your clipboard — ready to paste into Claude, Copilot, or whatever agent you're working with.
Built for Hyprland/Wayland first. Works on X11 too.
# Install Bun (if not already installed)
curl -fsSL https://bun.sh/install | bash
# Install ffmpeg (required for Opus audio conversion)
# Arch: sudo pacman -S ffmpeg
# Ubuntu: sudo apt install ffmpeg
# Fedora: sudo dnf install ffmpeg

git clone https://github.com/Snehit70/hyprvox.git
cd hyprvox
bun install
bun run index.ts config init # Set up API keys (Groq + Deepgram)
bun run index.ts install # Install as systemd service

Press Right Ctrl to record. Press again to stop. Paste anywhere.
Works on both Wayland and X11. On X11/GNOME/KDE, the built-in hotkey works out of the box. On Wayland (Hyprland, Sway), see compositor keybind setup for reliable system-wide hotkeys.
Copy this prompt to your coding agent:
Install and configure hyprvox on this Linux system:
1. Clone: git clone https://github.com/Snehit70/hyprvox.git
2. Install: cd hyprvox && bun install
3. Run `bun run index.ts config init` — I'll provide API keys when prompted:
- Groq API key (get from console.groq.com)
- Deepgram API key (get from console.deepgram.com)
4. Install service: bun run index.ts install
5. For Hyprland, add keybind to ~/.config/hypr/hyprland.conf:
bind = , code:105, exec, bun run /path/to/hyprvox/index.ts toggle
# code:105 = Right Control (use `wev` to find other key codes)
6. For Hyprland overlay, add to ~/.config/hypr/UserConfigs/WindowRules.conf:
windowrule = match:class hyprvox-overlay, float on
windowrule = match:class hyprvox-overlay, pin on
windowrule = match:class hyprvox-overlay, no_focus on
windowrule = match:class hyprvox-overlay, no_shadow on
windowrule = match:class hyprvox-overlay, no_anim on
windowrule = match:class hyprvox-overlay, move ((monitor_w-window_w)*0.5) (monitor_h-window_h-50)
7. Reload: hyprctl reload
8. Verify: bun run index.ts health
Dual-engine transcription. Audio goes to both Groq (Whisper V3) and Deepgram (Nova-3) in parallel. Results are merged with an LLM for better accuracy. If one fails, the other continues.
Streaming or batch. ~500ms latency in streaming mode. Higher accuracy in batch mode. Your choice.
Runs as a daemon. Systemd service starts on login. Always ready when you need it.
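The dual-engine behaviour described above (both engines run in parallel, results are merged, and one failure does not stop the other) can be sketched roughly as follows. This is a minimal illustration, not hyprvox's actual code: the `Engine` and `Merge` types and the `transcribeDual` function are hypothetical names, and the merge step stands in for the LLM call.

```typescript
// Hypothetical engine signature: takes audio bytes, resolves to a transcript.
type Engine = (audio: Uint8Array) => Promise<string>;
// Hypothetical merge step; in hyprvox this role is played by an LLM.
type Merge = (a: string, b: string) => Promise<string>;

async function transcribeDual(
  audio: Uint8Array,
  engineA: Engine,
  engineB: Engine,
  merge: Merge,
): Promise<string> {
  // Start both requests at once. allSettled never rejects, so one
  // failing engine cannot cancel or hide the other's result.
  const [a, b] = await Promise.allSettled([engineA(audio), engineB(audio)]);

  if (a.status === "fulfilled" && b.status === "fulfilled") {
    return merge(a.value, b.value);
  }
  // Fall back to whichever engine succeeded.
  if (a.status === "fulfilled") return a.value;
  if (b.status === "fulfilled") return b.value;
  throw new Error("both transcription engines failed");
}
```

Using `Promise.allSettled` rather than `Promise.all` is what makes the fallback possible: `Promise.all` would reject as soon as either engine failed.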
| Metric | Value |
|---|---|
| Median latency | 882ms |
| Real-time factor | 39x faster than real-time |
| Dual-engine success | 93.5% |
| Filler words removed | 12.3% (by LLM cleanup) |
| LLM merge overhead | ~280ms |
The LLM doesn't just merge — it removes filler words ("um", "uh"), false starts, and self-corrections automatically.
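For readers who want to reproduce the latency and real-time-factor rows from their own logs, the arithmetic might look like the sketch below. It assumes the real-time factor means seconds of audio divided by seconds of processing, which the README does not define explicitly; the function names are hypothetical and not part of the hyprvox CLI.

```typescript
// Median of a list of latencies in milliseconds.
function median(values: number[]): number {
  if (values.length === 0) throw new Error("no samples");
  const sorted = [...values].sort((x, y) => x - y);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Assumed definition: seconds of audio handled per second of processing.
function realTimeFactor(audioSeconds: number, processingSeconds: number): number {
  if (processingSeconds <= 0) throw new Error("processing time must be positive");
  return audioSeconds / processingSeconds;
}
```

Under that definition, a factor of 39 would correspond to, say, 39 seconds of speech being processed in about one second.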
A small waveform appears at the bottom of your screen while recording — visual feedback that it's listening.
For Hyprland, add these window rules:
# ~/.config/hypr/UserConfigs/WindowRules.conf
windowrule = match:class hyprvox-overlay, float on
windowrule = match:class hyprvox-overlay, pin on
windowrule = match:class hyprvox-overlay, no_focus on
windowrule = match:class hyprvox-overlay, no_shadow on
windowrule = match:class hyprvox-overlay, no_anim on
windowrule = match:class hyprvox-overlay, move ((monitor_w-window_w)*0.5) (monitor_h-window_h-50)
Audio — alsa-utils
- Arch: `sudo pacman -S alsa-utils`
- Ubuntu: `sudo apt install alsa-utils`
- Fedora: `sudo dnf install alsa-utils`
Clipboard
- Wayland: `wl-clipboard`
- X11: `xclip` or `xsel`
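As an illustration of why the clipboard dependency differs per display server, here is a minimal sketch of how a tool could choose a copy command. It is an assumption about the approach, not hyprvox's implementation: `clipboardCommand` is a hypothetical helper, and it treats any session without `WAYLAND_DISPLAY` set as X11.

```typescript
interface ClipboardCommand {
  cmd: string;
  args: string[];
}

// Pick a copy command from the session environment.
// Assumption: Wayland sessions set WAYLAND_DISPLAY; anything else is treated as X11.
function clipboardCommand(env: Record<string, string | undefined>): ClipboardCommand {
  if (env.WAYLAND_DISPLAY) {
    // wl-copy is provided by wl-clipboard and reads the text from stdin.
    return { cmd: "wl-copy", args: [] };
  }
  // xclip also reads from stdin; -selection clipboard targets the Ctrl+V clipboard.
  return { cmd: "xclip", args: ["-selection", "clipboard"] };
}
```

The transcript would then be written to the chosen command's standard input.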
Permissions
sudo usermod -aG audio,input $USER
# Log out and back in

| Provider | Purpose | Link |
|---|---|---|
| Groq | Whisper V3 (fast) | console.groq.com |
| Deepgram | Nova-3 (accurate) | console.deepgram.com |
Run `bun run index.ts config init` to set up both API keys.
bun run index.ts status # Check daemon status
bun run index.ts health # Test system setup
bun run index.ts toggle # Start/stop recording
bun run index.ts history # View past transcriptions
bun run index.ts logs # Tail daemon logs
bun run index.ts errors # Show last error
bun run index.ts config init # Set up API keys
bun run index.ts boost add # Add custom vocabulary

Config file: ~/.config/hypr/vox/config.json
{
"apiKeys": { "groq": "...", "deepgram": "..." },
"transcription": {
"streaming": true,
"boostWords": ["Hyprland", "WebSocket", "refactor"]
}
}

- Streaming mode: ~500ms latency, slightly lower accuracy.
- Batch mode: 2-8 seconds, higher accuracy.
- Boost words: improve recognition for technical terms.
Full options: Configuration Guide
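If you edit the config file by hand, a quick shape check can catch typos before the daemon reads it. The sketch below covers only the fields shown in the sample config above; the `HyprvoxConfig` interface and `parseConfig` function are hypothetical names, and the real schema may contain more options or treat some of these as optional.

```typescript
// Shape inferred from the sample config; other fields may exist.
interface HyprvoxConfig {
  apiKeys: { groq: string; deepgram: string };
  transcription: { streaming: boolean; boostWords: string[] };
}

// Parse a JSON string and check the fields shown in the sample.
function parseConfig(json: string): HyprvoxConfig {
  const raw = JSON.parse(json);
  const keys = raw?.apiKeys;
  const t = raw?.transcription;
  if (typeof keys?.groq !== "string" || typeof keys?.deepgram !== "string") {
    throw new Error("apiKeys.groq and apiKeys.deepgram must be strings");
  }
  if (typeof t?.streaming !== "boolean") {
    throw new Error("transcription.streaming must be a boolean");
  }
  if (
    !Array.isArray(t.boostWords) ||
    !t.boostWords.every((w: unknown) => typeof w === "string")
  ) {
    throw new Error("transcription.boostWords must be an array of strings");
  }
  return {
    apiKeys: { groq: keys.groq, deepgram: keys.deepgram },
    transcription: { streaming: t.streaming, boostWords: t.boostWords },
  };
}
```

With Bun, the file contents could be passed in via `await Bun.file(path).text()`.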
Add a keybind for a global hotkey:
# ~/.config/hypr/hyprland.conf
bind = , code:105, exec, bun run /path/to/hyprvox/index.ts toggle
# code:105 = Right Control
Use `wev | grep -A5 "key event"` to find key codes.
Binding the key in the compositor, rather than relying on the built-in hotkey, bypasses XWayland limitations.
Full guide: Wayland Support
| Problem | Fix |
|---|---|
| Hotkey not working | Add user to input group; use compositor binds on Wayland |
| No audio | Add user to audio group |
| Clipboard issues | Install wl-clipboard (Wayland) or xclip (X11) |
| Service won't start | Check logs: journalctl --user -u hyprvox -f |
Full guide: Troubleshooting
- Architecture — How it works under the hood
- Configuration — All options explained
- CLI Commands — Every command and flag
- Wayland Support — Platform-specific setup
- Use Conventional Commits on branches merged into `main`; `feat:` triggers a minor bump and `fix:` triggers a patch bump.
- `.github/workflows/release-please.yml` opens or updates the release PR, and `.github/workflows/release.yml` publishes tagged releases after tests pass.
- Release Please uses `release-please-config.json` and `.release-please-manifest.json` to track the root package version.
- Set repository Actions permissions to `Read and write`, and enable `Allow GitHub Actions to create and approve pull requests` or provide a `RELEASE_PLEASE_TOKEN` secret with repo scope.
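To make the commit-to-version mapping concrete, here is a rough sketch of how a commit subject line translates to a bump. It is only an illustration and not Release Please's own code: `bumpFor` is a hypothetical function, and treating a `!` marker as a major bump is an assumption taken from the Conventional Commits convention rather than from this README.

```typescript
type Bump = "major" | "minor" | "patch" | "none";

function bumpFor(commitSubject: string): Bump {
  // Matches: type, optional (scope), optional "!", then ": ".
  const m = /^(\w+)(\([^)]*\))?(!)?: /.exec(commitSubject);
  if (!m) return "none";
  if (m[3]) return "major"; // assumed: "!" marks a breaking change
  if (m[1] === "feat") return "minor";
  if (m[1] === "fix") return "patch";
  return "none"; // docs:, chore:, etc. do not bump the version in this sketch
}
```

For example, `feat: add overlay` maps to a minor bump and `fix(audio): handle missing mic` to a patch bump.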
MIT
