
felixbrock/voxpaste


Your voice is the fastest interface to AI.



Voxpaste is a lightweight CLI tool with multi-provider support that turns your voice into text and drops it straight into your clipboard. Choose from Mistral, OpenAI, Groq, Deepgram, or OpenRouter.

Stop typing. Start speaking.

Two Modes for Different Workflows

Raw Mode (default) — One-to-one transcription with the absolute lowest latency. Perfect for pasting into LLMs like Claude, ChatGPT, or Cursor that can handle natural speech patterns. No processing overhead, just pure speed.

voxpaste

Clean Mode (--clean) — AI-powered cleanup that removes filler words, repetitions, and noise. Still extremely low latency with only marginally higher cost (two API calls instead of one). Ideal for drafting emails, documentation, or any context where you need polished text ready to use without LLM assistance.

voxpaste --clean

Why Voxpaste?

  • Two modes, two workflows — Raw for LLMs, clean for everything else
  • Multiple providers — Choose from Mistral, OpenAI, Groq, Deepgram, or OpenRouter—not locked into a single service
  • Blazing fast — Sub-second transcription latency with optimized providers
  • Zero friction — Record → transcribe → clipboard, all in one command
  • Built for AI workflows — Designed for developers who talk to AI all day. Bind it to a hotkey and invoke it from anywhere
  • Privacy-conscious — Your audio goes directly to your chosen provider, no middlemen

Prerequisites

  • Python 3.11 or higher

  • uv (recommended) or pip

  • An API key from one of the supported providers: Mistral, OpenAI, Groq, Deepgram, or OpenRouter

  • System dependencies for audio recording and clipboard:

    Linux:

    # Debian/Ubuntu
    sudo apt install libportaudio2 xclip
    
    # Arch Linux
    sudo pacman -S portaudio xclip
    
    # Fedora
    sudo dnf install portaudio xclip

    macOS:

    # Install Homebrew if not already installed: https://brew.sh
    brew install portaudio

    Note: macOS has pbcopy built-in for clipboard support (no additional installation needed).

Installation

Install globally with uv (recommended)

uv tool install git+https://github.com/felixbrock/voxpaste.git

Or from a local clone:

git clone https://github.com/felixbrock/voxpaste.git
cd voxpaste
uv tool install .

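This makes the voxpaste command available from any directory for your user. If your shell cannot find it, run uv tool update-shell to add uv's tool directory to your PATH.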

Install globally with pipx

pipx install git+https://github.com/felixbrock/voxpaste.git

Install with pip

pip install git+https://github.com/felixbrock/voxpaste.git

Or from a local clone:

git clone https://github.com/felixbrock/voxpaste.git
cd voxpaste
pip install .

Note: Using uv or pipx is recommended for CLI tools as they create isolated environments and avoid conflicts with system packages.

Configuration

Choosing a Provider

Voxpaste supports multiple speech-to-text providers. Set VOXPASTE_PROVIDER to choose one:

| Provider | Value | Default Model | Notes |
| --- | --- | --- | --- |
| Mistral | mistral (default) | voxtral-mini-latest | Best latency, good accuracy |
| Groq | groq | whisper-large-v3 | Best latency, generous free tier |
| OpenAI | openai | whisper-1 | Most widely used, higher latency |
| Deepgram | deepgram | nova-2 | Real-time focused, higher latency |
| OpenRouter | openrouter | mistralai/voxtral-small-24b-2507 | Access to multiple models via unified API |

Recommended: Use Mistral or Groq for the fastest transcription experience.

Fallback Provider

You can configure an automatic fallback STT provider with VOXPASTE_FALLBACK_PROVIDER. If the primary provider fails after all built-in retries, Voxpaste will resend the same recorded audio to the fallback provider instead of discarding the recording.

VOXPASTE_PROVIDER=mistral
VOXPASTE_FALLBACK_PROVIDER=groq

The fallback provider must be different from the primary provider and must have its corresponding API key configured. When fallback transcription succeeds, Voxpaste also raises a best-effort system notification so you know a backup provider was used.
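
The fallback flow described above can be sketched roughly as follows. This is an illustration, not Voxpaste's actual code: the function name transcribe_with_fallback, the Transcriber callable type, and the retry count of 3 are all assumptions made for the example.

```python
from typing import Callable, Optional, Tuple

# Hypothetical shape of a provider client: audio bytes in, transcript out.
Transcriber = Callable[[bytes], str]


def transcribe_with_fallback(
    audio: bytes,
    primary: Transcriber,
    fallback: Optional[Transcriber] = None,
    retries: int = 3,  # assumed value; the real retry policy is not documented here
) -> Tuple[str, bool]:
    """Return (text, used_fallback) for one recording."""
    last_error: Optional[Exception] = None
    for _ in range(retries):
        try:
            return primary(audio), False
        except Exception as exc:  # a real client would catch narrower errors
            last_error = exc
    if fallback is None:
        raise RuntimeError(
            "primary provider failed and no fallback is configured"
        ) from last_error
    # The same recorded audio is resent; nothing is re-recorded.
    return fallback(audio), True
```

The boolean in the return value is what a caller could use to decide whether to raise the "backup provider was used" notification.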

Choosing a Model

You can customize which model to use for each provider by setting the corresponding environment variable:

| Provider | Environment Variable | Default Value | Other Options |
| --- | --- | --- | --- |
| Mistral | MISTRAL_MODEL | voxtral-mini-latest | Check Mistral docs for available models |
| OpenAI | OPENAI_MODEL | whisper-1 | Currently only whisper-1 is available |
| Groq | GROQ_MODEL | whisper-large-v3 | whisper-large-v3-turbo, distil-whisper-large-v3-en |
| Deepgram | DEEPGRAM_MODEL | nova-2 | nova, enhanced, base - see Deepgram docs |
| OpenRouter | OPENROUTER_MODEL | mistralai/voxtral-small-24b-2507 | google/gemini-2.5-flash, google/gemini-3-flash-preview - see audio models |

If not specified, the default model for each provider will be used.
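
As a rough illustration of the lookup rule above, the model for a provider could be resolved like this. The variable names and defaults come from the table; the function resolve_model and the choice to treat an empty value as unset are assumptions, not Voxpaste's confirmed behavior.

```python
import os
from typing import Mapping, Optional

# Defaults as listed in this README.
DEFAULT_MODELS = {
    "mistral": "voxtral-mini-latest",
    "openai": "whisper-1",
    "groq": "whisper-large-v3",
    "deepgram": "nova-2",
    "openrouter": "mistralai/voxtral-small-24b-2507",
}


def resolve_model(provider: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Pick <PROVIDER>_MODEL from the environment, else the default."""
    env = os.environ if env is None else env
    if provider not in DEFAULT_MODELS:
        raise ValueError(f"unknown provider: {provider!r}")
    # Assumption: an empty value counts as "not specified".
    return env.get(f"{provider.upper()}_MODEL") or DEFAULT_MODELS[provider]
```

For example, resolve_model("groq", {"GROQ_MODEL": "whisper-large-v3-turbo"}) selects the turbo model, while resolve_model("groq", {}) falls back to whisper-large-v3.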

Setting up your API key

Configuration is stored in ~/.config/voxpaste/.env. See .env.example for a complete configuration template.

  1. Create the config directory:

    mkdir -p ~/.config/voxpaste
  2. Create the environment file with your provider and API key. Run only the block for your provider, since > overwrites any existing file:

    # For Mistral (default)
    echo "MISTRAL_API_KEY=your-api-key-here" > ~/.config/voxpaste/.env
    
    # For OpenAI
    echo "VOXPASTE_PROVIDER=openai" > ~/.config/voxpaste/.env
    echo "OPENAI_API_KEY=your-api-key-here" >> ~/.config/voxpaste/.env
    
    # For Groq
    echo "VOXPASTE_PROVIDER=groq" > ~/.config/voxpaste/.env
    echo "GROQ_API_KEY=your-api-key-here" >> ~/.config/voxpaste/.env
    
    # For Deepgram
    echo "VOXPASTE_PROVIDER=deepgram" > ~/.config/voxpaste/.env
    echo "DEEPGRAM_API_KEY=your-api-key-here" >> ~/.config/voxpaste/.env
    
    # For OpenRouter
    echo "VOXPASTE_PROVIDER=openrouter" > ~/.config/voxpaste/.env
    echo "OPENROUTER_API_KEY=your-api-key-here" >> ~/.config/voxpaste/.env

    Optional: Customize the model for your chosen provider:

    # Example: Use a different Groq model
    echo "GROQ_MODEL=whisper-large-v3-turbo" >> ~/.config/voxpaste/.env
    
    # Example: Use a different Deepgram model
    echo "DEEPGRAM_MODEL=nova" >> ~/.config/voxpaste/.env
    
    # Example: Use a different OpenRouter model
    echo "OPENROUTER_MODEL=google/gemini-2.5-flash" >> ~/.config/voxpaste/.env
  3. Secure the file:

    chmod 600 ~/.config/voxpaste/.env

Usage

Basic Usage

Simply run:

voxpaste
  1. The tool starts recording from your default microphone
  2. Speak your instructions
  3. Press Enter to stop recording
  4. The audio is sent to your configured provider for transcription
  5. If that provider fails after retries and a fallback is configured, the same audio is retried with the fallback provider
  6. The transcription is printed and copied to your clipboard

The last transcription is also saved to ~/.cache/voxpaste/last_transcription.txt.

Cleaning Transcriptions

Use the --clean flag to automatically clean up your transcriptions using an LLM. This removes filler words (um, uh, like), repetitions, and noise while preserving the original meaning:

voxpaste --clean

By default, cleaning uses the same provider as your STT transcription. To customize the cleaning provider and model, configure them in ~/.config/voxpaste/.env:

# Use Groq for fast, cost-effective cleaning
VOXPASTE_CLEANING_PROVIDER=groq
GROQ_CLEANING_MODEL=llama-3.3-70b-versatile

# Or use OpenRouter with Claude for high-quality cleaning
VOXPASTE_CLEANING_PROVIDER=openrouter
OPENROUTER_CLEANING_MODEL=anthropic/claude-3.5-sonnet

# Or use OpenAI
VOXPASTE_CLEANING_PROVIDER=openai
OPENAI_CLEANING_MODEL=gpt-4o-mini

Available cleaning providers: mistral, openai, groq, openrouter

See .env.example for all cleaning configuration options and default models.
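
The defaulting rule for cleaning can be sketched as below. This is a hedged illustration only: resolve_cleaning is a hypothetical helper, and how Voxpaste actually behaves when the STT provider is not a valid cleaning provider (for example Deepgram) is not stated in this README, so raising an error here is an assumption.

```python
from typing import Mapping, Optional, Tuple

# Cleaning providers as listed in this README.
CLEANING_PROVIDERS = ("mistral", "openai", "groq", "openrouter")


def resolve_cleaning(env: Mapping[str, str]) -> Tuple[str, Optional[str]]:
    """Return (cleaning_provider, cleaning_model_or_None)."""
    stt_provider = env.get("VOXPASTE_PROVIDER", "mistral")
    # Cleaning falls back to the STT provider when not set explicitly.
    provider = env.get("VOXPASTE_CLEANING_PROVIDER", stt_provider)
    if provider not in CLEANING_PROVIDERS:
        # Assumed behavior; the real tool may handle this differently.
        raise ValueError(f"{provider!r} cannot be used for cleaning")
    # None means "use the tool's default cleaning model" (see .env.example).
    model = env.get(f"{provider.upper()}_CLEANING_MODEL")
    return provider, model
```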

Pro tip: Bind to a global hotkey

For the best experience, bind voxpaste to a system-wide keyboard shortcut so you can trigger it from anywhere—no terminal needed.

Linux (GNOME):

Settings → Keyboard → Keyboard Shortcuts → Custom Shortcuts → Add:

  • Name: Voxpaste
  • Command: voxpaste
  • Shortcut: e.g., Super+Shift+V

Linux (KDE):

System Settings → Shortcuts → Custom Shortcuts → Edit → New → Global Shortcut → Command/URL, then enter voxpaste as the command and assign a shortcut.

macOS:

Use Automator to create a Quick Action that runs voxpaste, then assign a shortcut in System Settings → Keyboard → Keyboard Shortcuts → Services.

Alternatively, tools like Raycast, Alfred, or Hammerspoon can bind shell commands to hotkeys.

Troubleshooting

No audio input detected:

  • Make sure your microphone is connected and set as the default input device
  • Linux: Check with pactl list sources to see available audio sources
  • macOS: Check System Settings → Sound → Input

Clipboard not working:

  • Linux: Install xclip or xsel (the tool tries both)
  • macOS: Uses pbcopy which is built-in (should work automatically)
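
To see which clipboard tool would be picked up on your machine, a small check along these lines may help. It is a sketch under assumptions: the function clipboard_command is hypothetical, and the order (xclip before xsel) is a guess at how the tool "tries both", not confirmed from Voxpaste's source.

```python
import shutil
import sys
from typing import Callable, List, Optional


def clipboard_command(
    platform: str = sys.platform,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[List[str]]:
    """Return the argv of the first available clipboard tool, or None."""
    if platform == "darwin":
        candidates = [["pbcopy"]]
    else:
        candidates = [
            ["xclip", "-selection", "clipboard"],
            ["xsel", "--clipboard", "--input"],
        ]
    for argv in candidates:
        if which(argv[0]):  # found on PATH
            return argv
    return None
```

If clipboard_command() returns None on Linux, neither xclip nor xsel is on your PATH. Otherwise the returned command can be tested by hand with subprocess.run(argv, input=b"test", check=True).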

API key not found:

  • Verify the file exists: cat ~/.config/voxpaste/.env
  • Make sure there are no extra spaces around the = sign
  • Ensure you have the correct API key variable for your provider (MISTRAL_API_KEY, OPENAI_API_KEY, GROQ_API_KEY, DEEPGRAM_API_KEY, or OPENROUTER_API_KEY)
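
The checks above can be automated with a short script. This is a simplified sketch, not part of Voxpaste: check_env_text is a hypothetical helper, it does not handle quoted values, and it assumes mistral is used when VOXPASTE_PROVIDER is absent, as the provider table indicates.

```python
from typing import Dict, List

# API key variable per provider, as listed in this README.
API_KEY_VARS = {
    "mistral": "MISTRAL_API_KEY",
    "openai": "OPENAI_API_KEY",
    "groq": "GROQ_API_KEY",
    "deepgram": "DEEPGRAM_API_KEY",
    "openrouter": "OPENROUTER_API_KEY",
}


def check_env_text(text: str) -> List[str]:
    """Return a list of human-readable problems found in .env content."""
    problems: List[str] = []
    values: Dict[str, str] = {}
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        if "=" not in stripped:
            problems.append(f"line {number}: expected NAME=value")
            continue
        name, value = stripped.split("=", 1)
        if name != name.rstrip() or value != value.lstrip():
            problems.append(f"line {number}: remove spaces around '='")
        values[name.strip()] = value.strip()
    provider = values.get("VOXPASTE_PROVIDER", "mistral")
    key_var = API_KEY_VARS.get(provider)
    if key_var is None:
        problems.append(f"unknown provider {provider!r}")
    elif not values.get(key_var):
        problems.append(f"{key_var} is missing or empty for provider {provider!r}")
    return problems
```

A possible way to run it against your own file is check_env_text(pathlib.Path("~/.config/voxpaste/.env").expanduser().read_text()); an empty list means none of the three issues above were detected.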
