hyprvox


Voice input for AI workflows on Linux.

The Problem

You're deep in a session with a coding agent. You know exactly what you want to ask — a complex refactor, a debugging question, a feature request. But now you have to type it all out.

By the time you're done, you've lost the thread.

Context switching kills flow. And typing at 40 WPM when you can speak at 150 WPM is a bottleneck you don't need.

The Solution

Press a key. Speak. Press again. Paste.

hyprvox is a voice-to-text daemon for Linux. It runs in the background, transcribes when you need it, and puts the result on your clipboard — ready to paste into Claude, Copilot, or whatever agent you're working with.

Built for Hyprland/Wayland first. Works on X11 too.

Quick Start

Prerequisites

# Install Bun (if not already installed)
curl -fsSL https://bun.sh/install | bash

# Install ffmpeg (required for Opus audio conversion)
# Arch:   sudo pacman -S ffmpeg
# Ubuntu: sudo apt install ffmpeg
# Fedora: sudo dnf install ffmpeg

Installation

git clone https://github.com/Snehit70/hyprvox.git
cd hyprvox
bun install

bun run index.ts config init   # Set up API keys (Groq + Deepgram)
bun run index.ts install       # Install as systemd service

Press Right Ctrl to record. Press again to stop. Paste anywhere.

Works on both Wayland and X11. On X11/GNOME/KDE, the built-in hotkey works out of the box. On Wayland (Hyprland, Sway), see compositor keybind setup for reliable system-wide hotkeys.
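The press-to-toggle model is simple enough to sketch. The snippet below is an illustrative state machine, not hyprvox's actual implementation; the state-file path and function name are hypothetical stand-ins for state the real daemon keeps internally:

```typescript
// Hypothetical press-to-toggle sketch. hyprvox's daemon tracks recording
// state internally; here a temp file stands in for that state.
import { existsSync, writeFileSync, unlinkSync } from "node:fs";

const STATE = "/tmp/hyprvox-recording.sketch";

function toggle(): "started" | "stopped" {
  if (existsSync(STATE)) {
    unlinkSync(STATE); // second press: stop, transcribe, copy to clipboard
    return "stopped";
  }
  writeFileSync(STATE, String(Date.now())); // first press: begin recording
  return "started";
}
```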

For AI Agents

Click to expand setup prompt

Copy this prompt to your coding agent:

Install and configure hyprvox on this Linux system:

1. Clone: git clone https://github.com/Snehit70/hyprvox.git
2. Install: cd hyprvox && bun install
3. Run `bun run index.ts config init` — I'll provide API keys when prompted:
   - Groq API key (get from console.groq.com)
   - Deepgram API key (get from console.deepgram.com)
4. Install service: bun run index.ts install
5. For Hyprland, add keybind to ~/.config/hypr/hyprland.conf:
    bind = , code:105, exec, bun run /path/to/hyprvox/index.ts toggle
    # code:105 = Right Control (use `wev` to find other key codes)
6. For Hyprland overlay, add to ~/.config/hypr/UserConfigs/WindowRules.conf:
    windowrule = match:class hyprvox-overlay, float on
    windowrule = match:class hyprvox-overlay, pin on
    windowrule = match:class hyprvox-overlay, no_focus on
    windowrule = match:class hyprvox-overlay, no_shadow on
    windowrule = match:class hyprvox-overlay, no_anim on
    windowrule = match:class hyprvox-overlay, move ((monitor_w-window_w)*0.5) (monitor_h-window_h-50)
7. Reload: hyprctl reload
8. Verify: bun run index.ts health

How It Works

Dual-engine transcription. Audio goes to both Groq (Whisper V3) and Deepgram (Nova-3) in parallel. Results are merged with an LLM for better accuracy. If one fails, the other continues.
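The fan-out-and-fallback pattern described above can be sketched in TypeScript. The engine and merge functions here are stand-in stubs for illustration, not hyprvox's actual API:

```typescript
// Stand-in engine stubs; the real calls hit the Groq and Deepgram HTTP APIs.
async function transcribeGroq(_audio: Uint8Array): Promise<string> {
  return "hello world";
}
async function transcribeDeepgram(_audio: Uint8Array): Promise<string> {
  throw new Error("simulated engine failure");
}
// Stand-in merge; the real merge asks an LLM to reconcile both transcripts.
function mergeWithLLM(a: string, b: string): string {
  return a.length >= b.length ? a : b;
}

async function transcribe(audio: Uint8Array): Promise<string> {
  // Fan out to both engines in parallel; allSettled lets the survivor
  // win if one engine fails.
  const results = await Promise.allSettled([
    transcribeGroq(audio),
    transcribeDeepgram(audio),
  ]);
  const ok = results
    .filter((r): r is PromiseFulfilledResult<string> => r.status === "fulfilled")
    .map((r) => r.value);
  if (ok.length === 0) throw new Error("both engines failed");
  return ok.length === 2 ? mergeWithLLM(ok[0], ok[1]) : ok[0];
}
```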

Streaming or batch. ~500ms latency in streaming mode. Higher accuracy in batch mode. Your choice.

Runs as a daemon. Systemd service starts on login. Always ready when you need it.
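A systemd user service of roughly this shape is what the install command sets up. The unit below is a hedged sketch: the path and the `daemon` subcommand in ExecStart are assumptions for illustration, not copied from the installer.

```
# ~/.config/systemd/user/hyprvox.service (illustrative sketch)
[Unit]
Description=hyprvox voice-to-text daemon

[Service]
ExecStart=/usr/bin/bun run /path/to/hyprvox/index.ts daemon
Restart=on-failure

[Install]
WantedBy=default.target
```

If installing a unit like this by hand, enable it with `systemctl --user enable --now hyprvox`.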

Performance

Metric                 Value
Median latency         882 ms
Real-time factor       39x faster than real-time
Dual-engine success    93.5%
Filler words removed   12.3% (by LLM cleanup)
LLM merge overhead     ~280 ms

The LLM doesn't just merge — it removes filler words ("um", "uh"), false starts, and self-corrections automatically.
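For illustration only, here is a crude regex version of filler removal. The real cleanup is done by the LLM, which also catches false starts and self-corrections that a regex cannot:

```typescript
// Toy filler stripper; hyprvox does this with an LLM, not a regex.
const FILLERS = /\b(?:um+|uh+|erm?)\b[,.]?\s*/gi;

function stripFillers(text: string): string {
  return text.replace(FILLERS, "").replace(/\s{2,}/g, " ").trim();
}
```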

The Overlay

A small waveform appears at the bottom of your screen while recording — visual feedback that it's listening.

Overlay showing waveform during recording

For Hyprland, add these window rules:

# ~/.config/hypr/UserConfigs/WindowRules.conf
windowrule = match:class hyprvox-overlay, float on
windowrule = match:class hyprvox-overlay, pin on
windowrule = match:class hyprvox-overlay, no_focus on
windowrule = match:class hyprvox-overlay, no_shadow on
windowrule = match:class hyprvox-overlay, no_anim on
windowrule = match:class hyprvox-overlay, move ((monitor_w-window_w)*0.5) (monitor_h-window_h-50)

Installation

Dependencies

Click to expand

Audio: alsa-utils

  • Arch: sudo pacman -S alsa-utils
  • Ubuntu: sudo apt install alsa-utils
  • Fedora: sudo dnf install alsa-utils

Clipboard

  • Wayland: wl-clipboard
  • X11: xclip or xsel

Permissions

sudo usermod -aG audio,input $USER
# Log out and back in

API Keys

Provider   Purpose             Link
Groq       Whisper V3 (fast)   console.groq.com
Deepgram   Nova-3 (accurate)   console.deepgram.com

Run bun run index.ts config init to set them up.

Usage

bun run index.ts status      # Check daemon status
bun run index.ts health      # Test system setup
bun run index.ts toggle      # Start/stop recording
bun run index.ts history     # View past transcriptions
bun run index.ts logs        # Tail daemon logs
bun run index.ts errors      # Show last error
bun run index.ts config init # Set up API keys
bun run index.ts boost add   # Add custom vocabulary

Configuration

Config file: ~/.config/hypr/vox/config.json

{
  "apiKeys": { "groq": "...", "deepgram": "..." },
  "transcription": {
    "streaming": true,
    "boostWords": ["Hyprland", "WebSocket", "refactor"]
  }
}

  • Streaming mode — ~500ms latency, slightly lower accuracy.
  • Batch mode — 2-8 seconds, higher accuracy.
  • Boost words — improve recognition for technical terms.

Full options: Configuration Guide

Hyprland Setup

Add keybind for global hotkey:

# ~/.config/hypr/hyprland.conf
bind = , code:105, exec, bun run /path/to/hyprvox/index.ts toggle
# code:105 = Right Control

Use wev | grep -A5 "key event" to find key codes.

Binding the key in the compositor bypasses XWayland's global-hotkey limitations on Wayland.

Full guide: Wayland Support

Troubleshooting

Problem               Fix
Hotkey not working    Add user to input group; use compositor binds on Wayland
No audio              Add user to audio group
Clipboard issues      Install wl-clipboard (Wayland) or xclip (X11)
Service won't start   Check logs: journalctl --user -u hyprvox -f

Full guide: Troubleshooting

Documentation

Release Workflow

  • Use Conventional Commits on branches merged into main; feat: triggers a minor bump and fix: triggers a patch bump.
  • .github/workflows/release-please.yml opens or updates the release PR, and .github/workflows/release.yml publishes tagged releases after tests pass.
  • Release Please uses release-please-config.json and .release-please-manifest.json to track the root package version.
  • Set repository Actions permissions to Read and write, and enable Allow GitHub Actions to create and approve pull requests or provide a RELEASE_PLEASE_TOKEN secret with repo scope.

License

MIT
