Let's be honest: the job market for early grads is really tough right now. Just landing an interview invite feels like a massive win. But when you finally get on that call and freeze up, it’s heartbreaking; all that hard work goes out the window because of nerves or lack of practice.
That’s why I built RoundZero. It’s a live AI platform that lets you "interview before it counts." Powered by the Gemini Live API and the Google GenAI SDK, it talks to you in real time just like a human interviewer would. It gives you a safe space to practice your technical and behavioral skills, so you can walk into the real thing feeling ready and confident.
Core Components
RoundZero consists of two core components:
React Frontend: Handles real-time audio and video capture, screen-sharing coordination, low-latency audio playback, voice activity detection for AI interruptions, live transcription display, interview phase control, dynamic system prompt generation, and a responsive UI.
Python Backend: Manages high-speed WebSocket proxying, secure Google API authentication, real-time evaluation signals, candidate scoring, feedback generation, historical data aggregation, REST-based session management, and environment configuration.
| Component | Core Technologies | Key Libraries & Frameworks |
|---|---|---|
| Frontend | React 19, TypeScript, Vite | Radix UI, React Router, Web Audio API, WebSocket APIs |
| Backend | Python 3 | AIOHTTP, WebSockets, Google GenAI SDK (Gemini), Prometheus |
| Deployment | Docker, Docker Compose, Cloud Run, Google Container Registry, GitHub Actions | - |
Architecture
The architecture of RoundZero is intentionally split into two distinct channels to balance human-like interactivity with deep analytical assessment.
1. Live Interactive Channel (WebSocket + Proxy)
This channel handles the "talking" part of the interview. To keep latency low without exposing API credentials to the browser, we use a proxy architecture.
The Handshake: When a session starts, the browser opens a WebSocket to the Python backend. The backend intercepts the first message, injects the Gemini credentials and Model ID, and then opens a second, secure WebSocket to the actual Gemini Live API servers.
Bi-directional Stream: Once the "bridge" is built, the backend acts as a high-speed pipe. It forwards the binary audio chunks (PCM) and screen frames directly to Gemini, and streams the AI's audio response back to the browser for immediate playback.
Real-time Transcripts: Since `input_audio_transcription` is enabled in the Live API setup, Gemini sends back text events alongside the audio. The frontend uses these events to trigger the second part of the architecture: the assessment.
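
The handshake and relay described above can be sketched roughly as follows. This is a minimal illustration, not RoundZero's actual backend code. It assumes both connections expose `recv()`, `send()`, and async iteration (as connections from the `websockets` library do), that the first client message is JSON, and that the credential is attached when the upstream socket is opened, which happens outside this sketch. The names `inject_setup`, `pump`, and `bridge`, and the `"setup"`/`"model"` field names, are hypothetical.

```python
import asyncio
import json


def inject_setup(first_message: str, model_id: str) -> str:
    """Return the client's first message with the server-side model ID added.

    The "setup" and "model" field names are illustrative assumptions.
    """
    payload = json.loads(first_message)
    payload.setdefault("setup", {})["model"] = model_id
    return json.dumps(payload)


async def pump(source, sink) -> None:
    """Forward every message from source to sink until source closes."""
    async for message in source:
        await sink.send(message)


async def bridge(client, upstream, model_id: str) -> None:
    """Rewrite the first client message, then relay traffic in both directions."""
    first = await client.recv()
    await upstream.send(inject_setup(first, model_id))

    tasks = [
        asyncio.create_task(pump(client, upstream)),  # mic audio, screen frames
        asyncio.create_task(pump(upstream, client)),  # AI audio, transcripts
    ]
    # Tear the bridge down as soon as either side closes.
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    for task in pending:
        task.cancel()
    await asyncio.gather(*pending, return_exceptions=True)
    for task in done:
        task.result()  # re-raise any relay error
```

Because `pump` never inspects the payload, binary audio chunks and text events pass through unchanged. Only the first message is parsed.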

2. Assessment Channel (Iterative Evaluation)
While the WebSocket handles the conversation, a separate REST-based architecture handles the evaluation. This ensures that the heavy AI thinking required for scoring doesn't slow down the live voice.
A. Signal Generation (Pre-Interview)
Before the conversation begins, the system uses the standard Gemini API to "prime" the interview. Based on the chosen question and difficulty, it generates a list of Evaluation Signals: concise competencies like "System Design Trade-offs" or "STAR-method behavioral responses."
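
As a rough illustration of this priming step, the sketch below builds a prompt from the question and difficulty and parses the model's reply into a clean list of signal names. The prompt wording, the JSON-array reply format, and the function names `build_signal_prompt` and `parse_signals` are assumptions made for the example. The call to Gemini itself is left out.

```python
import json


def build_signal_prompt(question: str, difficulty: str, count: int = 5) -> str:
    """Compose a priming prompt (the wording is an illustrative assumption)."""
    return (
        "You are preparing an interview.\n"
        f"Question: {question}\n"
        f"Difficulty: {difficulty}\n"
        f"List {count} concise evaluation signals as a JSON array of strings."
    )


def parse_signals(model_text: str, limit: int = 5) -> list[str]:
    """Extract a de-duplicated list of signal names from a JSON array reply."""
    data = json.loads(model_text)
    if not isinstance(data, list):
        raise ValueError("expected a JSON array of signal names")
    seen: set[str] = set()
    signals: list[str] = []
    for item in data:
        name = str(item).strip()
        key = name.lower()
        # Skip blanks and case-insensitive duplicates, keep first spelling.
        if name and key not in seen:
            seen.add(key)
            signals.append(name)
    return signals[:limit]
```

Validating the reply before the interview starts means a malformed response fails early, rather than in the middle of a live session.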
B. Score Snapshots (During Interview)
The platform performs asynchronous scoring:
- As the candidate speaks, the Live Channel sends text transcripts to the frontend.
- The frontend periodically sends these snippets to the backend's `/signals/evaluate` endpoint.
- A separate Gemini instance analyzes the text against the specific signals and returns a "snapshot" (scores 0-10 + notes).
- These are stored in the server's Score History, which tracks performance progression in real time.
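
One way to picture the Score History is as a per-signal list of snapshots. The sketch below is an assumption about its shape, not the actual server code. The `Snapshot` and `ScoreHistory` names and their fields are hypothetical, and only the 0-10 score range and the notes come from the description above.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Snapshot:
    """One evaluation of a single signal at a point in the interview."""
    signal: str
    score: int
    notes: str = ""

    def __post_init__(self) -> None:
        # Reject anything outside the 0-10 scale before it is stored.
        if not 0 <= self.score <= 10:
            raise ValueError(f"score must be in 0..10, got {self.score}")


@dataclass
class ScoreHistory:
    """Snapshots grouped by signal, in the order they were recorded."""
    snapshots: dict[str, list[Snapshot]] = field(default_factory=dict)

    def record(self, snapshot: Snapshot) -> None:
        self.snapshots.setdefault(snapshot.signal, []).append(snapshot)

    def latest(self, signal: str) -> Optional[Snapshot]:
        entries = self.snapshots.get(signal)
        return entries[-1] if entries else None
```

Keeping every snapshot, rather than overwriting the latest score, is what makes it possible to look at progression later.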
C. Holistic Synthesis (Post-Interview)
When the session ends, the backend reviews the entire "trajectory" of the scores. It uses a final Gemini prompt to look at the whole history, noting where the candidate improved or struggled, to generate the final feedback report.
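
Before the final prompt is written, the trajectory could be condensed into a per-signal label. The sketch below is a simplified assumption of that step: it compares only the first and last score for each signal. The function name `summarize_trajectory` and the labels are hypothetical, and the real synthesis is done by Gemini over the full history.

```python
def summarize_trajectory(history: dict[str, list[int]]) -> dict[str, str]:
    """Label each signal by comparing its first and last recorded score."""
    summary: dict[str, str] = {}
    for signal, scores in history.items():
        if len(scores) < 2:
            # A single snapshot cannot show a trend.
            summary[signal] = "insufficient data"
            continue
        delta = scores[-1] - scores[0]
        if delta > 0:
            summary[signal] = "improved"
        elif delta < 0:
            summary[signal] = "declined"
        else:
            summary[signal] = "steady"
    return summary
```

For example, `summarize_trajectory({"Communication": [4, 6, 8]})` labels Communication as `"improved"`. Labels like these could be included in the final prompt alongside the raw snapshots.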

By separating these two channels, RoundZero can "think" deeply about the performance without the awkward pauses typical of most AI voice bots.

Deployment Architecture
- GitHub Actions: Automates the CI/CD pipeline for building and deploying both services.
- Docker: Containerizes each component for environment consistency.
- Google Container Registry (GCR): Version-controlled storage for production images.
- Google Cloud Run: Managed serverless hosting that scales on demand.
- Automated Handshake: CI pipeline injects the production backend URL into the frontend build.

What it does
- Live Interactive Engine: Multimodal conversations with low-latency responses.
- AI Interruption: Naturally interrupt the AI at any point during its response, just like in a real human-to-human interview.
- Noise Suppression: Voice Activity Detection (VAD) ensures the AI only responds to your speech, filtering out background noise.
- Multimodal Feedback: Share your screen for technical problems or your camera for behavioral presence analysis.
- Signal Scoring & Assessment: In-depth evaluation against industry rubrics for signals like Communication and Scalability.
- Interview Configuration: Customize your practice sessions with specific roles, difficulty levels, and assessment criteria.
- Intuitive UI: A sleek, distraction-free interface designed to help you focus on the interview experience.
- Real-Time Transcriptions: Instant on-screen text for every word spoken, indexed for post-interview review.
- Robust Persistence: Smart reconnection logic maintains your session state during network interruptions.
What's next for RoundZero
- Candidate Dashboard: Track growth with performance charts and identify consistent weak points across sessions.
- Scenario Library: A Netflix-style library of role-specific paths and company-themed drills.
- Assessment Playback: Synchronized transcript and audio playback, plus an "AI Internal Monologue" column to see why the AI asked certain questions.
- Resume-to-Interview Pipeline: Upload your resume to generate hyper-personalized interviews based on your real past experience.
- AI Persona Customization: Tune the interviewer to be a "Friendly Mentor," "Academic Giant," or "Tech Savvy Senior Engineer."
Built With
- cloud-run
- gemini-live-api
- githubactions
- google-antigravity
- google-genai
- prometheus
- python
- react
- typescript
