Voice Arena - Hackathon Submission
Inspiration
AI voice agents are increasingly vulnerable to security attacks: prompt injection, social engineering, and credential leaks happen in production with no visibility. Traditional testing is manual, slow, and blind to root causes. We built Voice Arena to solve this with autonomous security testing and AI-powered self-healing.
The inspiration came from seeing real-world AI agents fail silently in production. We wanted a system that could not just detect vulnerabilities, but automatically fix them, creating a self-improving security loop powered by GPT-4o and complete observability through Sentry.
What it does
Voice Arena is an autonomous AI security testing and self-repair system for voice agents. It provides two powerful modes:
Self-Healing Mode:
- Tests voice agents with adversarial inputs
- Detects security leaks, repetition loops, and policy violations
- Uses GPT-4o to analyze failures and generate improved prompts
- Automatically re-tests until the agent is secure (or max iterations reached)
- Complete Sentry integration for full observability
Red Team Mode:
- GPT-4o generates sophisticated attack strategies across 7 categories (security leak, social engineering, jailbreak, prompt injection, etc.)
- Tests agents with AI-generated attacks
- Automatically generates defenses when vulnerabilities are found
- Comprehensive vulnerability scanning with one click
- Tracks vulnerability reduction percentage across healing rounds
Key Features:
- Zero human intervention—fully autonomous testing and fixing
- Real-time WebSocket updates showing each iteration
- Sentry AI Agent Monitoring integration for complete traceability
- Mock mode for free development and demos
- Production-ready with FastAPI backend and Next.js frontend
How we built it
Architecture:
- Backend (FastAPI): REST API + WebSocket for real-time updates
- Frontend (Next.js 16): React dashboard with Tailwind CSS and Framer Motion
- Core Components:
  - healer.py: Orchestrates the self-healing loop
  - elevenlabs_client.py: Voice agent testing and failure detection
  - openai_fixer.py: GPT-4o-powered fix generation
  - red_team_attacker.py: AI attack generation with 7 categories
  - sentry_api.py: Context fetching from Sentry for informed fixes
The Self-Healing Loop:
Test Agent → Detect Failures → GPT-4o Analyzes → Generate Fix → Re-Test → Loop
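The loop above can be sketched in a few lines of Python (a simplified outline; `test_agent` and `generate_fix` are stand-ins for the ElevenLabs and GPT-4o clients, not the project's actual signatures):

```python
# Simplified sketch of the self-healing loop.
# test_agent / generate_fix stand in for the ElevenLabs and GPT-4o clients.
from dataclasses import dataclass, field

@dataclass
class TestResult:
    failures: list[str] = field(default_factory=list)

    @property
    def secure(self) -> bool:
        return not self.failures

def self_heal(prompt: str, test_agent, generate_fix, max_iterations: int = 5):
    """Test -> detect -> fix -> re-test until secure or out of budget."""
    history = []
    for iteration in range(1, max_iterations + 1):
        result = test_agent(prompt)
        history.append((iteration, prompt, result.failures))
        if result.secure:
            return prompt, history        # agent passed every check
        prompt = generate_fix(prompt, result.failures, history)
    return prompt, history                # budget exhausted; best effort
```

Passing the full `history` into the fixer mirrors the design point below about iteration context: each fix attempt can see what earlier prompts already tried and failed.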
Red Team Flow:
GPT-4o Generates Attack → Test Agent → Analyze Response → If Vulnerable → Generate Defense → Verify
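One red-team round could be sketched like this (an illustrative outline; the category names shown and the callable signatures are assumptions, and in the real system the attack and defense generators would call GPT-4o):

```python
# Sketch of one red-team round: attack, probe, and patch if vulnerable.
# Category list and callable names are illustrative, not the project's API.
ATTACK_CATEGORIES = [
    "security_leak", "social_engineering", "jailbreak", "prompt_injection",
]

def red_team_round(prompt, generate_attack, run_agent, is_vulnerable,
                   generate_defense):
    """Return a (possibly hardened) prompt plus per-category findings."""
    findings = {}
    for category in ATTACK_CATEGORIES:
        attack = generate_attack(category, prompt)   # GPT-4o in the real system
        response = run_agent(prompt, attack)
        vulnerable = is_vulnerable(response, category)
        findings[category] = vulnerable
        if vulnerable:
            prompt = generate_defense(prompt, attack, category)
    return prompt, findings
```

Because the hardened prompt carries forward within the round, a defense generated for one category is immediately stress-tested by the remaining categories.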
Tech Stack:
- Backend: Python 3.10+, FastAPI, async/await
- Frontend: Next.js 16, React, TypeScript, Tailwind CSS 4
- AI: OpenAI GPT-4o (fix generation & attack generation)
- Voice: ElevenLabs API (conversational AI testing)
- Monitoring: Sentry AI Agent Monitoring (tracing, error capture, context)
- Sandbox: Daytona (optional isolation for secure testing)
Development Process:
- Built core components independently (ElevenLabs client, OpenAI fixer, Sentry integration)
- Implemented self-healing orchestrator with iteration tracking
- Added Red Team mode with adaptive attack generation
- Created Next.js dashboard with real-time WebSocket updates
- Integrated Sentry for complete observability and context-aware fixes
Challenges we ran into
1. Failure Detection Accuracy
- Problem: Distinguishing between legitimate responses and security leaks
- Solution: Built multi-layered detection with pattern matching, keyword analysis, and context-aware rules
2. GPT-4o Fix Quality
- Problem: Initial fixes were too generic or broke agent functionality
- Solution: Enhanced prompts with failure context from Sentry, previous iteration history, and specific vulnerability types
3. Real-Time Updates
- Problem: Frontend needed live iteration results without polling
- Solution: Implemented WebSocket connections with session management for real-time streaming
4. Red Team Attack Diversity
- Problem: GPT-4o generating similar attacks across categories
- Solution: Added attack history tracking, category-specific prompts, and confidence scoring to ensure diverse, creative attacks
5. Sentry Integration Complexity
- Problem: Fetching relevant context from Sentry for fix generation
- Solution: Built context-aware API fetcher that retrieves issue details, traces, and error context to inform GPT-4o fixes
6. Mock Mode Realism
- Problem: Mock responses needed to be realistic for development
- Solution: Created sophisticated mock system that simulates real agent behaviors, failures, and responses
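The multi-layered detection from challenge 1 might combine regex patterns, keyword checks, and a repetition heuristic roughly as follows (an illustrative sketch; the specific patterns and keyword lists here are assumptions, not the project's actual rules):

```python
# Illustrative multi-layered failure detector: regex patterns, keyword
# checks, and a simple repetition heuristic. Rules here are examples only.
import re

LEAK_PATTERNS = [
    re.compile(r"\b(?:api[_ ]?key|password|secret)\s*[:=]\s*\S+", re.I),
    re.compile(r"my (?:system )?instructions (?:are|say)", re.I),
]
POLICY_KEYWORDS = {"internal use only", "do not disclose"}

def detect_failures(transcript: list[str]) -> list[str]:
    failures = []
    for line in transcript:
        lowered = line.lower()
        if any(p.search(line) for p in LEAK_PATTERNS):
            failures.append(f"security_leak: {line[:60]}")
        if any(k in lowered for k in POLICY_KEYWORDS):
            failures.append(f"policy_violation: {line[:60]}")
    # Repetition loop: the same response three or more times in a row.
    for a, b, c in zip(transcript, transcript[1:], transcript[2:]):
        if a == b == c:
            failures.append(f"repetition_loop: {a[:60]}")
            break
    return failures
```

Layering cheap checks like these in front of the GPT-4o analysis keeps obvious leaks fast to flag while leaving context-dependent judgments to the model.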
Accomplishments that we're proud of
✅ Fully Autonomous System - Zero human intervention from test to fix to verification
✅ AI-Powered Red Team - GPT-4o generates creative, adaptive attacks across 7 categories
✅ Complete Observability - Sentry integration provides full traceability of every iteration, failure, and fix
✅ Fast Healing - Average fix time under 4 seconds for common vulnerabilities
✅ Production-Ready Architecture - Clean separation of concerns, async/await throughout, comprehensive error handling
✅ Beautiful UI - Modern Next.js dashboard with real-time updates, smooth animations, and intuitive UX
What we learned
Technical Learnings:
- GPT-4o is incredibly effective at both generating attacks and creating targeted fixes when given proper context
- Sentry's AI Agent Monitoring provides invaluable context that dramatically improves fix quality
- WebSocket real-time updates create a much better UX than polling
- Mock mode is essential for rapid development and demos without API costs
Architecture Insights:
- Separating components (testing, detection, fixing) makes the system more maintainable
- Async/await throughout the stack enables true real-time updates
- Session management is crucial for tracking multi-iteration healing processes
AI Security Insights:
- Voice agents are vulnerable to many attack vectors beyond traditional prompt injection
- Social engineering attacks are particularly effective against voice agents
- Automatic defense generation is possible and effective with proper context
Development Process:
- Building independent components first, then orchestrating them, leads to cleaner code
- Real-time feedback loops (test → fix → test) are powerful for both development and production
What's next for Voice Arena
Short Term:
- Multi-Agent Testing: Test multiple agents simultaneously and compare security postures
- Custom Attack Libraries: Allow users to define custom attack scenarios and categories
- Performance Metrics: Track response times, token usage, and cost optimization
- Export/Import: Save and share secure prompts, attack patterns, and healing sessions
Medium Term:
- CI/CD Integration: Automatically test and heal agents in deployment pipelines
- Advanced Analytics: Dashboard showing vulnerability trends, fix effectiveness, and attack patterns
- Sandbox Execution: Full Daytona integration for isolated, secure testing environments
- Multi-Language Support: Test agents in multiple languages and locales
Long Term:
- Federated Learning: Learn from vulnerabilities across all users to improve detection and fixes
- Agent Marketplace: Share secure, tested agent prompts with the community
- Enterprise Features: Team collaboration, role-based access, audit logs
- Real-Time Monitoring: Continuous production monitoring with automatic healing on new vulnerabilities
Vision: Voice Arena becomes the standard for AI agent security testing, where every voice agent is tested, attacked, and healed before production, with complete observability and zero manual intervention.
🎥 Video demo
Built With
- api
- elevenlabs
- fastapi
- gpt-4o
- next.js
- openai
- python
- react
- sentry
- tailwind
- typescript
