A Discord-style, voice-first interface for interacting with multiple Cursor AI agents in parallel. Replace tab-based chat UIs with a single collaborative call.
Note: This project is currently in active development. Features may be incomplete or subject to change.
Codecall is a VS Code/Cursor extension that reimagines how developers interact with AI coding assistants. Instead of managing multiple chat tabs, Codecall presents a unified "video call" interface where:
- Each AI agent appears as a tile in a grid layout
- Agents work on tasks in parallel and report back via voice
- You speak to agents using push-to-talk
- Agents queue up to speak, one at a time, like a real meeting
- Spawn multiple agents - Create AI agents that work on different tasks simultaneously
- Visual status indicators - See at a glance which agents are idle, listening, working, or reporting
- Live output streaming - Watch agent output stream in real-time as captions on each tile
- File tracking - Track which files each agent reads and modifies
- Push-to-talk input - Speak naturally to give agents instructions
- Text-to-speech output - Agents summarize their work and speak it back to you
- Speaking queue - Agents take turns speaking; you control who goes next
- Multiple voice presets - Choose from professional, friendly, technical, calm, or energetic voices
- Auto-open files - When an agent speaks, the files it modified automatically open in the editor so you can follow along
- Cursor CLI integration - Spawns agents via the Cursor headless CLI
- ElevenLabs voice services - High-quality text-to-speech and speech-to-text
- Real-time streaming - JSON streaming for live progress updates
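
The JSON streaming mentioned above implies reading agent output incrementally rather than waiting for the process to exit. A minimal sketch of that idea, assuming the CLI emits one JSON object per line; the `StreamEvent` shape and the `JsonLineBuffer` name are illustrative and are not taken from the Codecall source or the Cursor CLI documentation:

```typescript
// Hypothetical event shape; the real Cursor CLI output may differ.
interface StreamEvent {
  type: string;
  [key: string]: unknown;
}

// Accumulates raw text chunks and returns one parsed event per complete line.
class JsonLineBuffer {
  private pending = "";

  // Feed a chunk of stdout text; returns the events completed by this chunk.
  push(chunk: string): StreamEvent[] {
    this.pending += chunk;
    const lines = this.pending.split("\n");
    // The last element is an incomplete line (or "" if the chunk ended on "\n").
    this.pending = lines.pop() ?? "";

    const events: StreamEvent[] = [];
    for (const line of lines) {
      const trimmed = line.trim();
      if (trimmed === "") continue;
      try {
        const parsed: unknown = JSON.parse(trimmed);
        if (
          typeof parsed === "object" &&
          parsed !== null &&
          typeof (parsed as { type?: unknown }).type === "string"
        ) {
          events.push(parsed as StreamEvent);
        }
      } catch {
        // Skip lines that are not valid JSON (for example, stray log output).
      }
    }
    return events;
  }
}
```

Buffering by line means a JSON object split across two chunks is only parsed once its closing newline arrives, so captions never show half an event.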
- VS Code or Cursor editor
- Node.js v18 or higher
- Cursor CLI (`agent` command available)
- API Keys:
  - Cursor API Key - For headless agent operations
  - ElevenLabs API Key - For voice features (TTS/STT)

Clone the repository and install dependencies:

    git clone https://github.com/yourusername/codecall.git
    cd codecall
    npm install

Copy the example environment file and fill in your API keys:

    cp .env.example .env

Edit `.env` with your values:

    # Required
    CURSOR_API_KEY=your_cursor_api_key_here
    ELEVENLABS_API_KEY=your_elevenlabs_api_key_here
    # Optional
    ELEVENLABS_AGENT_ID=your_agent_id_here
    PORT=3000

Build the extension:

    npm run compile

In a separate terminal:

    npm run server

Or with auto-reload during development:

    npm run server:watch

- Open the project in VS Code/Cursor
- Press `F5` to launch the Extension Development Host
- The Codecall panel will appear in the sidebar
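
The environment variables above could be checked once at server startup so that a missing key fails fast instead of surfacing later as a failed API call. The following is a sketch only: the `loadEnv` function, the `CodecallEnv` type, and the fallback port of 3000 are assumptions made for illustration, not code from this repository.

```typescript
// Hypothetical typed view of the variables listed in .env.example.
interface CodecallEnv {
  cursorApiKey: string;
  elevenLabsApiKey: string;
  elevenLabsAgentId?: string;
  port: number;
}

// Validates a plain key/value map (for example, process.env) and returns typed config.
function loadEnv(env: Record<string, string | undefined>): CodecallEnv {
  const required = ["CURSOR_API_KEY", "ELEVENLABS_API_KEY"];
  const missing = required.filter((name) => !env[name]?.trim());
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(", ")}`);
  }

  // Assumption: fall back to 3000, the value shown in the example file.
  const port = Number(env.PORT ?? "3000");
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`PORT must be an integer between 1 and 65535, got "${env.PORT}"`);
  }

  return {
    cursorApiKey: env.CURSOR_API_KEY as string,
    elevenLabsApiKey: env.ELEVENLABS_API_KEY as string,
    elevenLabsAgentId: env.ELEVENLABS_AGENT_ID || undefined,
    port,
  };
}
```

In a Node.js process this would typically be called as `loadEnv(process.env)` after the `.env` file has been loaded.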
This section provides detailed instructions for running the extension in debug mode during development.
- Open the project in VS Code or Cursor:

      code "/home/areg_/Code Ubuntu/codecall"
      # or
      cursor "/home/areg_/Code Ubuntu/codecall"

- Install dependencies (if not already done):

      npm install

- Start the backend server in a terminal:

      npm run server

- Launch the Extension Development Host:
  - Press `F5` (or go to Run > Start Debugging)
  - This will:
    - Automatically compile the extension via the `npm run watch` task
    - Open a new VS Code/Cursor window (Extension Development Host)
    - Load the extension in development mode
- Access Codecall:
  - In the new window, click the Codecall icon in the Activity Bar (left sidebar)
  - The Codecall panel will open

For active development with hot-reloading:

Terminal 1 - Watch for extension changes:

    npm run watch

Terminal 2 - Backend server with auto-reload:

    npm run server:watch

Terminal 3 (Optional) - Full dev environment:

    npm run dev

Then press `F5` to launch the Extension Development Host. Changes to TypeScript files will be automatically recompiled.

- Breakpoints: Set breakpoints in `.ts` files under `src/`. The debugger will pause at these points.
- Debug Console: View `console.log` output from the extension in the Debug Console (View > Debug Console).
- Webview DevTools: With the Codecall sidebar panel open, run "Developer: Open Webview Developer Tools" from the Command Palette to debug the React UI.
- Reload Extension: Press `Ctrl+Shift+F5` (or `Cmd+Shift+F5` on Mac) to reload the Extension Development Host after making changes.

The debug configuration is defined in .vscode/launch.json:

    {
      "name": "Run Extension",
      "type": "extensionHost",
      "request": "launch",
      "args": ["--extensionDevelopmentPath=${workspaceFolder}"],
      "outFiles": ["${workspaceFolder}/out/**/*.js"],
      "preLaunchTask": "${defaultBuildTask}"
    }

| Issue | Solution |
|---|---|
| Extension doesn't appear | Make sure npm run compile completed without errors |
| "Cannot find module" errors | Run npm install again |
| Webview is blank | Check the Debug Console for errors; ensure the server is running |
| API calls failing | Verify your .env file has valid API keys |
| Changes not reflecting | Reload the Extension Host with Ctrl+Shift+F5 |
You can also configure API keys through VS Code/Cursor settings:
| Setting | Description |
|---|---|
| `codecall.cursorApiKey` | API key for Cursor agent CLI |
| `codecall.elevenLabsApiKey` | API key for ElevenLabs voice services |
| `codecall.elevenLabsAgentId` | ElevenLabs Conversational AI Agent ID (optional) |
| `codecall.defaultVoicePreset` | Default voice for new agents (professional, friendly, technical, calm, energetic) |
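
Because `codecall.defaultVoicePreset` accepts only five values, a setting read from user configuration may need to be narrowed before use. A small sketch of one way to do that; the helper names and the choice of `"professional"` as the fallback are assumptions for illustration, not the extension's documented behavior:

```typescript
// The five preset names listed in the settings table.
const VOICE_PRESETS = ["professional", "friendly", "technical", "calm", "energetic"] as const;
type VoicePreset = (typeof VOICE_PRESETS)[number];

// Type guard: true only for one of the known preset names.
function isVoicePreset(value: unknown): value is VoicePreset {
  return typeof value === "string" && VOICE_PRESETS.some((preset) => preset === value);
}

// Returns the configured preset, or a fallback when the value is missing or unknown.
// Assumption: "professional" is used as the fallback here purely for illustration.
function resolveVoicePreset(value: unknown, fallback: VoicePreset = "professional"): VoicePreset {
  return isVoicePreset(value) ? value : fallback;
}
```

Deriving the `VoicePreset` type from the array keeps the list of names in a single place.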
- Open the Codecall sidebar panel
- Click "Spawn Agent" or run `Codecall: Spawn Agent` from the command palette
- Enter a task prompt (e.g., "Refactor the authentication module")
- The agent appears as a new tile and begins working
- Single-click an agent to select it for voice input
- Double-click a working agent to interrupt it (puts it in listening mode)
- Dismiss an agent when you're done with it
- Hold the push-to-talk button to speak to the selected agent
- When an agent finishes a task, it generates a summary and queues to speak
- Click "Allow to Speak" on a queued agent to let it go next
- Agents speak one at a time to avoid overlapping audio
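
The turn-taking described above (agents queue when they finish, the user picks who goes next, and only one agent speaks at a time) can be modeled with a small queue. This is a sketch under stated assumptions: the `SpeakingQueue` class and its method names are hypothetical and do not come from the Codecall source.

```typescript
// Hypothetical model of the speaking queue; agents are identified by string IDs.
class SpeakingQueue {
  private waiting: string[] = [];
  private current: string | null = null;

  // Called when an agent has a summary ready. Duplicates are ignored.
  enqueue(agentId: string): void {
    if (agentId !== this.current && this.waiting.indexOf(agentId) === -1) {
      this.waiting.push(agentId);
    }
  }

  // "Allow to Speak": promote a specific queued agent, or the next in line.
  // Returns the agent that now holds the floor, or null if nothing changed.
  allowToSpeak(agentId?: string): string | null {
    if (this.current !== null) return null; // another agent is still speaking
    const index = agentId === undefined ? 0 : this.waiting.indexOf(agentId);
    if (index < 0 || index >= this.waiting.length) return null;
    this.current = this.waiting.splice(index, 1)[0];
    return this.current;
  }

  // Called when audio playback ends, freeing the floor for the next agent.
  finishSpeaking(): void {
    this.current = null;
  }

  // Called when an agent is dismissed, whether it is queued or speaking.
  remove(agentId: string): void {
    this.waiting = this.waiting.filter((id) => id !== agentId);
    if (this.current === agentId) this.current = null;
  }

  get speaker(): string | null {
    return this.current;
  }

  get queued(): readonly string[] {
    return this.waiting;
  }
}
```

Refusing to promote anyone while `current` is set is what prevents overlapping audio; the caller has to invoke `finishSpeaking()` before the next agent can take a turn.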

    ┌─────────────────────────────────────────────────────────────┐
    │                      VS Code Extension                      │
    ├─────────────────────────────────────────────────────────────┤
    │  extension.ts      │  Webview Provider, message routing     │
    │  agentManager.ts   │  Cursor CLI process management         │
    │  voiceManager.ts   │  ElevenLabs TTS/STT integration        │
    ├─────────────────────────────────────────────────────────────┤
    │                     Webview UI (React)                      │
    │  App.tsx           │  Main chat interface                   │
    │  components/       │  UI components                         │
    │  hooks/            │  Voice interaction hooks               │
    └─────────────────────────────────────────────────────────────┘
             │                             │
             ▼                             ▼
    ┌──────────────────┐          ┌──────────────────┐
    │   Cursor CLI     │          │   Hono Server    │
    │   (agent -p)     │          │   (server.ts)    │
    │                  │          │                  │
    │  • Stream JSON   │          │  • REST API      │
    │  • File ops      │          │  • TTS/STT       │
    │  • Tool calls    │          │  • Agent state   │
    └──────────────────┘          └──────────────────┘

| File | Purpose |
|---|---|
| `src/extension.ts` | Extension entry point, webview provider |
| `src/agentManager.ts` | Manages Cursor CLI agent processes |
| `src/voiceManager.ts` | Handles ElevenLabs voice services |
| `server.ts` | Hono-based API server for voice and agent operations |
| `src/webview-ui/` | React-based sidebar UI |

    # Build everything
    npm run compile
    # Watch mode (extension + webviews)
    npm run watch
    # Run the backend server
    npm run server
    # Run server with auto-reload
    npm run server:watch
    # Lint the codebase
    npm run lint
    # Run tests
    npm test

Project structure:

    codecall/
    ├── src/
    │   ├── extension.ts          # Extension entry point
    │   ├── agentManager.ts       # Agent lifecycle management
    │   ├── voiceManager.ts       # Voice services
    │   └── webview-ui/
    │       └── sidebar/          # React UI components
    ├── server.ts                 # Backend API server
    ├── package.json              # Extension manifest
    └── .env.example              # Environment template

The server exposes the following REST endpoints:
| Endpoint | Method | Description |
|---|---|---|
| `/api/health` | GET | Health check |
| `/api/agents` | GET | List all agents |
| `/api/agents/spawn` | POST | Spawn a new agent |
| `/api/agents/:id` | DELETE | Dismiss an agent |
| `/api/agents/:id/interrupt` | POST | Interrupt a working agent |
| `/api/agents/:id/message` | POST | Send follow-up message |
| `/api/speaking-queue` | GET | Get speaking queue status |
| `/api/voice/tts` | POST | Generate TTS audio |
| `/api/voice/scribe-token` | GET | Get speech-to-text token |
| `/api/voice/presets` | GET | List voice presets |
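
To show how the agent endpoints in the table map onto client calls, here is a sketch of a request builder. The paths and methods come from the table; everything else is an assumption, in particular the `prompt` and `text` body fields, the `AgentAction` type, and the `buildAgentRequest` name. Check `server.ts` for the actual request shapes.

```typescript
// Hypothetical description of the agent operations exposed by the server.
type AgentAction =
  | { kind: "list" }
  | { kind: "spawn"; prompt: string }
  | { kind: "dismiss"; id: string }
  | { kind: "interrupt"; id: string }
  | { kind: "message"; id: string; text: string };

interface ApiRequest {
  url: string;
  method: "GET" | "POST" | "DELETE";
  body?: string; // JSON-encoded; field names are assumed, not documented
}

// Maps an action to the URL, method, and body for the matching endpoint.
function buildAgentRequest(baseUrl: string, action: AgentAction): ApiRequest {
  const root = baseUrl.replace(/\/+$/, "") + "/api/agents";
  switch (action.kind) {
    case "list":
      return { url: root, method: "GET" };
    case "spawn":
      return {
        url: `${root}/spawn`,
        method: "POST",
        body: JSON.stringify({ prompt: action.prompt }),
      };
    case "dismiss":
      return { url: `${root}/${encodeURIComponent(action.id)}`, method: "DELETE" };
    case "interrupt":
      return { url: `${root}/${encodeURIComponent(action.id)}/interrupt`, method: "POST" };
    case "message":
      return {
        url: `${root}/${encodeURIComponent(action.id)}/message`,
        method: "POST",
        body: JSON.stringify({ text: action.text }),
      };
  }
}
```

The result can be passed to `fetch` as `fetch(req.url, { method: req.method, body: req.body, headers: { "Content-Type": "application/json" } })`, with `baseUrl` set to something like `http://localhost:3000` when the server runs on the example port.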
- Basic agent spawning and management
- Cursor CLI integration with streaming output
- ElevenLabs TTS integration
- Speaking queue management
- File tracking (modified/read files per agent)
- Auto-open files when agent reports
- Push-to-talk speech input (STT)
- Agent tile grid UI with waveforms
- Screen sharing for agents (code walkthrough)
- Persistent conversation history
- Multi-workspace support
Contributions are welcome! Please feel free to submit issues and pull requests.
MIT
- Cursor for the AI-powered editor and CLI
- ElevenLabs for voice synthesis
- Vercel AI SDK for AI integrations
