GitHub - felixbrock/voxpaste: Your voice is the fastest interface to AI

Your voice is the fastest interface to AI.

Voxpaste is a lightweight CLI tool with multi-provider support that turns your voice into text and drops it straight into your clipboard. Choose from Mistral, OpenAI, Groq, Deepgram, or OpenRouter.

Stop typing. Start speaking.

Two Modes for Different Workflows

Raw Mode (default): One-to-one transcription with the lowest latency of the two modes. Well suited to pasting into LLM tools like Claude, ChatGPT, or Cursor, which can handle natural speech patterns. There is no post-processing step, so the only delay is the transcription call itself.

```
voxpaste
```

Clean Mode (--clean): AI-powered cleanup that removes filler words, repetitions, and noise. Latency stays low and the cost is only marginally higher, because cleaning adds a second API call after transcription. Ideal for drafting emails, documentation, or any context where you need polished text ready to use without LLM assistance.

```
voxpaste --clean
```

Why Voxpaste?

- Two modes, two workflows: Raw for LLMs, clean for everything else
- Multiple providers: Choose from Mistral, OpenAI, Groq, Deepgram, or OpenRouter, so you are not locked into a single service
- Blazing fast: Sub-second transcription latency with optimized providers
- Zero friction: Record → transcribe → clipboard, all in one command
- Built for AI workflows: Designed for developers who talk to AI all day. Bind it to a hotkey and invoke it from anywhere
- Privacy-conscious: Your audio goes directly to your chosen provider, no middlemen

Prerequisites

- Python 3.11 or higher
- uv (recommended) or pip
- An API key from one of the supported providers:
  - Mistral (default)
  - OpenAI
  - Groq
  - Deepgram
  - OpenRouter

System dependencies for audio recording and clipboard:

Linux:

```
# Debian/Ubuntu
sudo apt install libportaudio2 xclip

# Arch Linux
sudo pacman -S portaudio xclip

# Fedora
sudo dnf install portaudio xclip
```

macOS:

```
# Install Homebrew if not already installed: https://brew.sh
brew install portaudio
```

Note: macOS has pbcopy built-in for clipboard support (no additional installation needed).

Installation

Install globally with uv (recommended)

```
uv tool install git+https://github.com/felixbrock/voxpaste.git
```

Or from a local clone:

```
git clone https://github.com/felixbrock/voxpaste.git
cd voxpaste
uv tool install .
```

This makes the voxpaste command available from any directory for your user account, provided uv's tool bin directory is on your PATH.

Install globally with pipx

```
pipx install git+https://github.com/felixbrock/voxpaste.git
```

Install with pip

```
pip install git+https://github.com/felixbrock/voxpaste.git
```

Or from a local clone:

```
git clone https://github.com/felixbrock/voxpaste.git
cd voxpaste
pip install .
```

Note: Using uv or pipx is recommended for CLI tools as they create isolated environments and avoid conflicts with system packages.

Configuration

Choosing a Provider

Voxpaste supports multiple speech-to-text providers. Set VOXPASTE_PROVIDER to choose one:

| Provider | Value | Default Model | Notes |
| --- | --- | --- | --- |
| Mistral | `mistral` (default) | voxtral-mini-latest | Best latency, good accuracy |
| Groq | `groq` | whisper-large-v3 | Best latency, generous free tier |
| OpenAI | `openai` | whisper-1 | Most widely used, higher latency |
| Deepgram | `deepgram` | nova-2 | Real-time focused, higher latency |
| OpenRouter | `openrouter` | mistralai/voxtral-small-24b-2507 | Access to multiple models via unified API |

Recommended: Use Mistral or Groq for the fastest transcription experience.

Fallback Provider

You can configure an automatic fallback STT provider with VOXPASTE_FALLBACK_PROVIDER. If the primary provider fails after all built-in retries, Voxpaste will resend the same recorded audio to the fallback provider instead of discarding the recording.

```
VOXPASTE_PROVIDER=mistral
VOXPASTE_FALLBACK_PROVIDER=groq
```

The fallback provider must be different from the primary provider and must have its corresponding API key configured. When fallback transcription succeeds, Voxpaste also raises a best-effort system notification so you know a backup provider was used.
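The retry-then-fallback behaviour described above can be sketched in Python roughly as follows. This is an illustration only, not Voxpaste's actual code: `transcribe_with_fallback`, the `Transcriber` type, and the retry count of 3 are hypothetical names and values chosen for the example.

```python
from __future__ import annotations

from collections.abc import Callable

# Hypothetical type: a function that takes recorded audio bytes and returns text.
Transcriber = Callable[[bytes], str]


def transcribe_with_fallback(
    audio: bytes,
    primary: Transcriber,
    fallback: Transcriber | None = None,
    retries: int = 3,  # assumed retry count, not taken from Voxpaste
) -> tuple[str, bool]:
    """Return (text, used_fallback). Raise if no provider succeeds."""
    last_error: Exception | None = None
    for _ in range(retries):
        try:
            return primary(audio), False
        except Exception as exc:  # a real client would catch narrower errors
            last_error = exc
    if fallback is None:
        raise RuntimeError("primary provider failed") from last_error
    # The same recorded audio is reused, so nothing has to be re-recorded.
    return fallback(audio), True


if __name__ == "__main__":
    def failing_primary(audio: bytes) -> str:
        raise ConnectionError("provider unavailable")

    def working_fallback(audio: bytes) -> str:
        return "hello world"

    print(transcribe_with_fallback(b"fake-audio", failing_primary, working_fallback))
```

The boolean in the return value is what a caller could use to decide whether to raise the "backup provider was used" notification.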

Choosing a Model

You can customize which model to use for each provider by setting the corresponding environment variable:

| Provider | Environment Variable | Default Value | Other Options |
| --- | --- | --- | --- |
| Mistral | `MISTRAL_MODEL` | `voxtral-mini-latest` | Check Mistral docs for available models |
| OpenAI | `OPENAI_MODEL` | `whisper-1` | Currently only `whisper-1` is available |
| Groq | `GROQ_MODEL` | `whisper-large-v3` | `whisper-large-v3-turbo`, `distil-whisper-large-v3-en` |
| Deepgram | `DEEPGRAM_MODEL` | `nova-2` | `nova`, `enhanced`, `base` - see Deepgram docs |
| OpenRouter | `OPENROUTER_MODEL` | `mistralai/voxtral-small-24b-2507` | `google/gemini-2.5-flash`, `google/gemini-3-flash-preview` - see audio models |

If not specified, the default model for each provider will be used.
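As a rough illustration of the lookup described above, the following Python sketch resolves a model name from a `<PROVIDER>_MODEL` variable and falls back to the documented default. The function name `resolve_model` is hypothetical, and treating an empty variable as unset is an assumption, not confirmed Voxpaste behaviour.

```python
import os

# Defaults copied from the table above.
DEFAULT_MODELS = {
    "mistral": "voxtral-mini-latest",
    "openai": "whisper-1",
    "groq": "whisper-large-v3",
    "deepgram": "nova-2",
    "openrouter": "mistralai/voxtral-small-24b-2507",
}


def resolve_model(provider: str, env=None) -> str:
    """Pick <PROVIDER>_MODEL from the environment, else the provider default."""
    env = os.environ if env is None else env
    if provider not in DEFAULT_MODELS:
        raise ValueError(f"unknown provider: {provider}")
    # An unset or empty variable falls back to the default (assumed behaviour).
    return env.get(f"{provider.upper()}_MODEL") or DEFAULT_MODELS[provider]


if __name__ == "__main__":
    print(resolve_model("groq", {"GROQ_MODEL": "whisper-large-v3-turbo"}))
    print(resolve_model("deepgram", {}))
```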

Setting up your API key

Configuration is stored in ~/.config/voxpaste/.env. See .env.example for a complete configuration template.

Create the config directory:
```
mkdir -p ~/.config/voxpaste
```

Create the environment file with your provider and API key:

```
# For Mistral (default)
echo "MISTRAL_API_KEY=your-api-key-here" > ~/.config/voxpaste/.env

# For OpenAI
echo "VOXPASTE_PROVIDER=openai" > ~/.config/voxpaste/.env
echo "OPENAI_API_KEY=your-api-key-here" >> ~/.config/voxpaste/.env

# For Groq
echo "VOXPASTE_PROVIDER=groq" > ~/.config/voxpaste/.env
echo "GROQ_API_KEY=your-api-key-here" >> ~/.config/voxpaste/.env

# For Deepgram
echo "VOXPASTE_PROVIDER=deepgram" > ~/.config/voxpaste/.env
echo "DEEPGRAM_API_KEY=your-api-key-here" >> ~/.config/voxpaste/.env

# For OpenRouter
echo "VOXPASTE_PROVIDER=openrouter" > ~/.config/voxpaste/.env
echo "OPENROUTER_API_KEY=your-api-key-here" >> ~/.config/voxpaste/.env
```

Optional: Customize the model for your chosen provider:

```
# Example: Use a different Groq model
echo "GROQ_MODEL=whisper-large-v3-turbo" >> ~/.config/voxpaste/.env

# Example: Use a different Deepgram model
echo "DEEPGRAM_MODEL=nova" >> ~/.config/voxpaste/.env

# Example: Use a different OpenRouter model
echo "OPENROUTER_MODEL=google/gemini-2.5-flash" >> ~/.config/voxpaste/.env
```

Secure the file:
```
chmod 600 ~/.config/voxpaste/.env
```
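If you would rather script the setup than run the shell commands one by one, the same steps (create the directory, write the file, restrict it to mode 600) could look like this in Python. This is a sketch under assumptions: `write_env_file` is a hypothetical helper, not part of Voxpaste, and the permission handling applies to POSIX systems only.

```python
import os
from pathlib import Path


def write_env_file(path: Path, settings: dict[str, str]) -> None:
    """Write NAME=value lines to path, readable and writable by the owner only."""
    path.parent.mkdir(parents=True, exist_ok=True)
    body = "".join(f"{name}={value}\n" for name, value in settings.items())
    # Create the file with mode 0o600 so it is never briefly world-readable.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(body)
    # The mode passed to os.open only applies to new files, so fix existing ones too.
    os.chmod(path, 0o600)
```

For example, `write_env_file(Path.home() / ".config/voxpaste/.env", {"VOXPASTE_PROVIDER": "groq", "GROQ_API_KEY": "your-api-key-here"})` would produce the same file as the Groq commands above. Like those commands, it overwrites any existing file.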

Usage

Basic Usage

Simply run:

```
voxpaste
```

1. The tool starts recording from your default microphone
2. Speak your instructions
3. Press Enter to stop recording
4. The audio is sent to your configured provider for transcription
5. If that provider fails after retries and a fallback is configured, the same audio is retried with the fallback provider
6. The transcription is printed and copied to your clipboard

The last transcription is also saved to ~/.cache/voxpaste/last_transcription.txt.

Cleaning Transcriptions

Use the --clean flag to automatically clean up your transcriptions using an LLM. This removes filler words (um, uh, like), repetitions, and noise while preserving the original meaning:

```
voxpaste --clean
```

By default, cleaning uses the same provider as your STT transcription. To customize the cleaning provider and model, configure them in ~/.config/voxpaste/.env:

```
# Use Groq for fast, cost-effective cleaning
VOXPASTE_CLEANING_PROVIDER=groq
GROQ_CLEANING_MODEL=llama-3.3-70b-versatile

# Or use OpenRouter with Claude for high-quality cleaning
VOXPASTE_CLEANING_PROVIDER=openrouter
OPENROUTER_CLEANING_MODEL=anthropic/claude-3.5-sonnet

# Or use OpenAI
VOXPASTE_CLEANING_PROVIDER=openai
OPENAI_CLEANING_MODEL=gpt-4o-mini
```

Available cleaning providers: mistral, openai, groq, openrouter

See .env.example for all cleaning configuration options and default models.
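The default rule stated above (cleaning uses the STT provider unless `VOXPASTE_CLEANING_PROVIDER` is set) can be sketched in Python as below. `resolve_cleaning_provider` is a hypothetical name. This README does not say what Voxpaste does when the STT provider is Deepgram, which is not in the cleaning list, and no cleaning provider is set; raising an error in that case is purely an assumption of this sketch.

```python
# Cleaning providers as listed above; Deepgram is STT-only.
CLEANING_PROVIDERS = {"mistral", "openai", "groq", "openrouter"}


def resolve_cleaning_provider(env: dict[str, str]) -> str:
    """Return the provider to use for the --clean step."""
    stt_provider = env.get("VOXPASTE_PROVIDER", "mistral")
    provider = env.get("VOXPASTE_CLEANING_PROVIDER") or stt_provider
    if provider not in CLEANING_PROVIDERS:
        # Assumed behaviour: fail loudly rather than guess a substitute.
        raise ValueError(f"{provider!r} cannot be used for cleaning")
    return provider


if __name__ == "__main__":
    print(resolve_cleaning_provider({"VOXPASTE_PROVIDER": "groq"}))
    print(resolve_cleaning_provider({
        "VOXPASTE_PROVIDER": "deepgram",
        "VOXPASTE_CLEANING_PROVIDER": "openai",
    }))
```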

Pro tip: Bind to a global hotkey

For the best experience, bind voxpaste to a system-wide keyboard shortcut so you can trigger it from anywhere—no terminal needed.

Linux (GNOME):

Settings → Keyboard → Keyboard Shortcuts → Custom Shortcuts → Add:

- Name: Voxpaste
- Command: voxpaste
- Shortcut: e.g., Super+Shift+V

Linux (KDE):

System Settings → Shortcuts → Custom Shortcuts → Edit → New → Global Shortcut → Command/URL

macOS:

Use Automator to create a Quick Action that runs voxpaste, then assign a shortcut in System Settings → Keyboard → Keyboard Shortcuts → Services.

Alternatively, tools like Raycast, Alfred, or Hammerspoon can bind shell commands to hotkeys.

Troubleshooting

No audio input detected:

- Make sure your microphone is connected and set as the default input device
- Linux: Check with `pactl list sources` to see available audio sources
- macOS: Check System Settings > Sound > Input

Clipboard not working:

- Linux: Install xclip or xsel (the tool tries both)
- macOS: Uses pbcopy which is built-in (should work automatically)
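To check which clipboard tool is actually reachable on your machine, a small Python sketch like the one below mirrors the "try xclip, then xsel" idea. It is not Voxpaste's implementation: the function names and the order of the candidates are assumptions for illustration, though the `xclip` and `xsel` flags shown are those tools' standard options for writing to the clipboard selection.

```python
from __future__ import annotations

import shutil
import subprocess
import sys


def clipboard_command(platform: str = sys.platform, which=shutil.which) -> list[str] | None:
    """Return the argv of the first available clipboard tool, or None."""
    if platform == "darwin":
        candidates = [["pbcopy"]]
    else:
        candidates = [
            ["xclip", "-selection", "clipboard"],
            ["xsel", "--clipboard", "--input"],
        ]
    for argv in candidates:
        if which(argv[0]):
            return argv
    return None


def copy_to_clipboard(text: str) -> bool:
    """Pipe text into the clipboard tool; return False if none is installed."""
    argv = clipboard_command()
    if argv is None:
        return False
    subprocess.run(argv, input=text.encode("utf-8"), check=True)
    return True


if __name__ == "__main__":
    print(clipboard_command() or "no clipboard tool found on PATH")
```

If this prints "no clipboard tool found on PATH" on Linux, installing xclip or xsel as described above should resolve it.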

API key not found:

- Verify the file exists: `cat ~/.config/voxpaste/.env`
- Make sure there are no extra spaces around the = sign
- Ensure you have the correct API key variable for your provider (MISTRAL_API_KEY, OPENAI_API_KEY, GROQ_API_KEY, DEEPGRAM_API_KEY, or OPENROUTER_API_KEY)
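The checks above can be automated with a short Python script. The following is a sketch, not part of Voxpaste: `check_env_text` is a hypothetical helper, it only understands plain `NAME=value` lines, and it assumes `mistral` is the provider when `VOXPASTE_PROVIDER` is absent (as the provider table states).

```python
from pathlib import Path

# API key variable expected for each provider, as listed above.
API_KEY_VARS = {
    "mistral": "MISTRAL_API_KEY",
    "openai": "OPENAI_API_KEY",
    "groq": "GROQ_API_KEY",
    "deepgram": "DEEPGRAM_API_KEY",
    "openrouter": "OPENROUTER_API_KEY",
}


def check_env_text(text: str) -> list[str]:
    """Return a list of human-readable problems found in .env contents."""
    problems: list[str] = []
    values: dict[str, str] = {}
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        name, sep, value = stripped.partition("=")
        if not sep:
            problems.append(f"line {number}: missing '='")
            continue
        if name != name.rstrip() or value != value.lstrip():
            problems.append(f"line {number}: spaces around '='")
        values[name.strip()] = value.strip()
    provider = values.get("VOXPASTE_PROVIDER", "mistral")
    key_var = API_KEY_VARS.get(provider)
    if key_var is None:
        problems.append(f"unknown provider: {provider}")
    elif not values.get(key_var):
        problems.append(f"{key_var} is not set for provider '{provider}'")
    return problems


if __name__ == "__main__":
    env_path = Path("~/.config/voxpaste/.env").expanduser()
    if not env_path.exists():
        print(f"{env_path} does not exist")
    else:
        for problem in check_env_text(env_path.read_text(encoding="utf-8")) or ["no problems found"]:
            print(problem)
```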

