Voice File Agent

A voice-controlled file management agent that lets you manage your files using natural language voice commands. Built with LangGraph, OpenAI, and ElevenLabs.

Features

Voice input for natural language commands
Intelligent file management using LangGraph's ReAct agent
Voice feedback using ElevenLabs
Supports common file operations (e.g., read, copy, delete)

Architecture

The agent follows a modular pipeline:

Voice Input – Audio is captured from your microphone
Transcription – Audio is converted to text using OpenAI's gpt-4o-mini-transcribe
LangGraph Agent – The prebuilt ReAct agent interprets the command
File Tools – The agent selects one of the file tools to perform the action
Text Response – The agent generates a natural language reply
Voice Output – The response is spoken using ElevenLabs
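
The stages above can be pictured as a chain of four callables. The sketch below is only an illustration of that flow: the `VoicePipeline` class and its field names are hypothetical and are not taken from this repository, and the stand-in lambdas replace the real microphone, OpenAI, LangGraph, and ElevenLabs calls.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoicePipeline:
    """Hypothetical wiring of the pipeline stages (names are assumptions)."""
    record: Callable[[], bytes]          # Voice Input
    transcribe: Callable[[bytes], str]   # Transcription
    run_agent: Callable[[str], str]      # LangGraph Agent + File Tools + Text Response
    speak: Callable[[str], None]         # Voice Output

    def handle_one_command(self) -> str:
        audio = self.record()
        text = self.transcribe(audio)
        reply = self.run_agent(text)
        self.speak(reply)
        return reply


# Stand-in stages so the sketch runs without a microphone or API keys.
spoken = []
demo = VoicePipeline(
    record=lambda: b"fake-audio",
    transcribe=lambda audio: "list files",
    run_agent=lambda text: f"Handled: {text}",
    speak=spoken.append,
)
result = demo.handle_one_command()
```

Keeping each stage behind a plain callable makes it easy to swap a real service for a stub when testing.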

Getting Started

Installation

Clone the repository:

git clone https://github.com/your-username/voice-file-agent.git
cd voice-file-agent

Install dependencies using Poetry:

poetry install

Create a .env file in the project root:

OPENAI_API_KEY="your-openai-api-key"
ELEVENLABS_API_KEY="your-elevenlabs-api-key"
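
Before starting the agent, it can help to confirm that both keys are visible to the process. The following is a minimal sketch that only inspects environment variables; how the project actually loads the .env file is not shown in this README, and the `missing_keys` helper is a hypothetical name.

```python
import os

# The two variables named in the .env example above.
REQUIRED_KEYS = ("OPENAI_API_KEY", "ELEVENLABS_API_KEY")


def missing_keys(env=None):
    """Return the required keys that are unset or empty (hypothetical helper)."""
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS if not env.get(key)]


# Example with an explicit mapping instead of the real environment.
partial = missing_keys({"OPENAI_API_KEY": "your-openai-api-key"})
complete = missing_keys({
    "OPENAI_API_KEY": "your-openai-api-key",
    "ELEVENLABS_API_KEY": "your-elevenlabs-api-key",
})
```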

Usage

Start the agent:

poetry run python main.py

Wait for the welcome message:

╭─────────────────── Welcome ───────────────────╮
│ 🎙️ Voice Agent is ready! Press Ctrl+C to exit. │
╰───────────────────────────────────────────────╯

Speak your command when prompted. For example:

- "List all files in the current directory"
- "Create a new file called notes.txt"
- "Read the contents of config.json"
- "Move file.txt to the backup folder"
- "Delete old_document.pdf"

Press Enter to stop recording your command.

The agent will:

- Process your command
- Show the transcribed text
- Execute the requested file operation
- Speak back the result

To exit the agent, press Ctrl+C.
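
The "repeat until Ctrl+C" behavior described above is commonly written as a loop that catches `KeyboardInterrupt`. This is a sketch of that general pattern, not the code in main.py; `run_until_interrupted` and `fake_command` are hypothetical names, and the fake handler simulates Ctrl+C on the third call.

```python
from typing import Callable


def run_until_interrupted(handle_one_command: Callable[[], None]) -> int:
    """Call the handler repeatedly; stop cleanly on Ctrl+C. Returns the count handled."""
    handled = 0
    try:
        while True:
            handle_one_command()
            handled += 1
    except KeyboardInterrupt:
        pass  # Ctrl+C ends the session without a traceback
    return handled


calls = []


def fake_command() -> None:
    # Simulate the user pressing Ctrl+C after two completed commands.
    if len(calls) == 2:
        raise KeyboardInterrupt
    calls.append("ok")


handled_count = run_until_interrupted(fake_command)
```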

Example Commands

Here are some example voice commands you can try:

"Show me what's in this folder"
"Create a new file called todo.txt with the text 'Buy groceries'"
"Read the contents of config.json"
"Copy important.pdf to the backup folder"
"Move old_document.txt to the archive folder"
"Delete temporary.txt"
"Search for all PDF files in this directory"
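
Each of these commands corresponds to an ordinary file operation. The sketch below shows one way such operations could look using only the Python standard library; the function names are hypothetical and are not the agent's actual tools, and the demo runs inside a temporary directory so nothing on disk is touched.

```python
import shutil
import tempfile
from pathlib import Path


def create_file(path: Path, text: str = "") -> None:
    path.write_text(text, encoding="utf-8")


def read_file(path: Path) -> str:
    return path.read_text(encoding="utf-8")


def copy_file(src: Path, dest_dir: Path) -> Path:
    dest_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(src, dest_dir / src.name))


def move_file(src: Path, dest_dir: Path) -> Path:
    dest_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.move(str(src), str(dest_dir / src.name)))


def delete_file(path: Path) -> None:
    path.unlink()


def search_files(directory: Path, pattern: str) -> list:
    return sorted(p.name for p in directory.glob(pattern))


# Demo in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    create_file(root / "todo.txt", "Buy groceries")
    contents = read_file(root / "todo.txt")
    copied = copy_file(root / "todo.txt", root / "backup")
    moved = move_file(root / "todo.txt", root / "archive")
    original_gone = not (root / "todo.txt").exists()
    delete_file(copied)
    copy_deleted = not copied.exists()
    found = search_files(root / "archive", "*.txt")
```

In an agent setting, functions like these would typically be wrapped as tools so the model can choose among them.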

Configuration

The agent's behavior can be customized by modifying core/config.py:

Voice settings (stability, similarity, style)
Sample rate for audio recording
Voice ID for ElevenLabs
Model settings
CLI theme and colors
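
The contents of core/config.py are not reproduced here, so the following is only a guess at how such settings could be grouped. Every class name, field name, and default value below is an assumption made for illustration, apart from the transcription model name, which comes from the Architecture section.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class VoiceSettings:
    # Placeholder values, not the project's real defaults.
    stability: float = 0.5
    similarity: float = 0.75
    style: float = 0.0


@dataclass(frozen=True)
class AgentConfig:
    voice: VoiceSettings = field(default_factory=VoiceSettings)
    sample_rate: int = 16_000                     # assumed recording rate in Hz
    voice_id: str = "your-elevenlabs-voice-id"    # placeholder
    transcription_model: str = "gpt-4o-mini-transcribe"
    theme_color: str = "cyan"                     # assumed CLI theme setting


# Override a single setting while keeping the other defaults.
config = AgentConfig(sample_rate=44_100)
```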

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.
