Inspiration
Past: interview prep meant flashcards, guesswork, and folklore (“make eye contact”). Present: LLM tools exist, but most are generic and ignore the actual job. Future: Voxtant turns a real job post into a live, private interview coach—measuring what you say, how you structure it, and how you deliver it, all in real time.
What it does
Reads the job: Paste a posting (URL or text). Voxtant extracts skills, responsibilities, and values into a compact “job graph.”
Plans the interview: Generates role-specific behavioral/technical questions with rubrics tied to that job.
Coaches live: Mock interview with real-time feedback on content coverage, STAR structure, and delivery (pace, pauses, fillers) plus on-device EQ (gaze stability, blink rate, expression variance).
Grades & guides: Returns scores and actionable tips (e.g., “add a metric to your Result,” “name the tool a team cares about”) and two exemplar rewrites for your weakest area.
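The write-up does not spell out the shape of the "job graph", so the following is only a rough sketch of one possible representation. Every name here (`JobGraph`, `SKILLS_MAP`, `build_job_graph`) is hypothetical, and plain alias matching stands in for the real extraction pipeline.

```python
from dataclasses import dataclass, field
import re

# Hypothetical curated skills map: canonical skill -> surface forms seen in postings.
SKILLS_MAP = {
    "python": ["python"],
    "sql": ["sql", "postgres"],
    "communication": ["communicate", "communication", "stakeholder"],
}


@dataclass
class JobGraph:
    skills: set[str] = field(default_factory=set)
    responsibilities: list[str] = field(default_factory=list)
    values: list[str] = field(default_factory=list)  # left empty in this sketch


def build_job_graph(posting: str) -> JobGraph:
    """Build a tiny job graph from raw posting text (illustrative only)."""
    graph = JobGraph()
    text = posting.lower()
    # A skill is present if any alias starts a word somewhere in the posting.
    for skill, aliases in SKILLS_MAP.items():
        if any(re.search(rf"\b{re.escape(alias)}", text) for alias in aliases):
            graph.skills.add(skill)
    # Assumption: bullet-like lines describe responsibilities.
    for line in posting.splitlines():
        line = line.strip()
        if line.startswith(("-", "*", "•")):
            graph.responsibilities.append(line.lstrip("-*• ").strip())
    return graph
```

Calling `build_job_graph(posting_text)` on a pasted posting yields a small object that later steps (question planning, grading) could read from.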
How we built it
Frontend: Next.js + TypeScript + Tailwind + shadcn/ui. Web Speech API fallback for STT. EQ overlay with MediaPipe Face Landmarker + lightweight OpenCV.js smoothing (all in-browser, no video leaves the device).
Backend (FastAPI):
Ingestion with Trafilatura/Readability to clean job pages.
Requirement/skills extraction with spaCy + MiniLM embeddings; small curated skills map for reliability.
Grading API: content similarity (transcript vs. job graph), STAR heuristics (S/T/A/R triggers), and delivery metrics.
LLM (Gemini) used for plan/rubrics and optional answer rewrites; deterministic fallbacks for hackathon demos.
Demo resilience: “Demo Mode” returns stable outputs even on flaky Wi-Fi. CORS/env hardening, clear health checks, and route smoke tests.
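As an illustration of the STAR heuristics and delivery metrics mentioned above, here is a minimal sketch. The trigger phrases, filler list, and function names are assumptions made for the example; the project's actual lists and thresholds are not given in this write-up.

```python
import re

# Hypothetical trigger phrases per STAR component (not the project's real lists).
STAR_TRIGGERS = {
    "situation": ["when i was", "at my last", "we were facing"],
    "task": ["my goal was", "i was responsible", "i needed to"],
    "action": ["i built", "i led", "i decided", "i implemented"],
    "result": ["as a result", "which reduced", "which increased", "resulted in"],
}

# Hypothetical filler words; "like" will over-count legitimate uses.
FILLERS = {"um", "uh", "like", "basically"}


def star_coverage(transcript: str) -> dict[str, bool]:
    """Report which STAR components have at least one trigger phrase."""
    text = transcript.lower()
    return {
        part: any(trigger in text for trigger in triggers)
        for part, triggers in STAR_TRIGGERS.items()
    }


def delivery_metrics(transcript: str, duration_seconds: float) -> dict[str, float]:
    """Compute speaking pace and filler ratio from a transcript and its duration."""
    words = re.findall(r"[a-z']+", transcript.lower())
    fillers = sum(1 for word in words if word in FILLERS)
    pace = len(words) / duration_seconds * 60 if duration_seconds > 0 else 0.0
    return {
        "words_per_minute": pace,
        "filler_ratio": fillers / len(words) if words else 0.0,
    }
```

Substring triggers are deliberately crude: they are cheap enough to run on every partial transcript, which matters for real-time feedback, but they need the embedding side to catch answers phrased differently.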
Challenges we ran into
Real-world scraping: Job aggregators block bots, so we designed multi-path ingestion (pasted text first, then ATS parsers) while keeping it ethical.
Latency & turn-taking: We separated live interviewer audio flow from local STT/EQ so one hiccup doesn’t freeze the other.
Measuring STAR, not just keywords: Built hybrid scoring—lightweight heuristics + embeddings—to respect different speaking styles.
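The hybrid scoring idea can be sketched as a weighted blend of a structure score and a content-similarity score. This is an assumption-heavy illustration: a bag-of-words cosine stands in for the embedding similarity so the example runs without a model download, and the 0.5 weight and the function names are hypothetical.

```python
import math
import re
from collections import Counter


def bow_vector(text: str) -> Counter:
    """Stand-in for a sentence embedding: lowercase word counts."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def hybrid_score(
    answer: str, requirement: str, star_parts_found: int, weight: float = 0.5
) -> float:
    """Blend STAR structure (0 to 4 parts found) with content similarity."""
    structure = star_parts_found / 4
    content = cosine(bow_vector(answer), bow_vector(requirement))
    return weight * structure + (1 - weight) * content
```

Keeping the two signals separate until the final blend means a well-structured answer that uses unusual vocabulary is not scored as zero, and a keyword-heavy answer with no structure is not scored as perfect.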
Accomplishments that we're proud of
Job-aware coaching that actually aligns answers to the posting—no more generic prep.
Privacy-first EQ: Face landmarks and delivery metrics computed on-device; only aggregate numbers are used.
Judge-proof demo: Works reliably with or without LLM keys, thanks to fallbacks and a deterministic mode.
Clear UX: A focused flow—Paste → Plan → Interview → Grade—with meters, chips, and a clean “instrument panel” aesthetic.
What we learned
Small taxonomies go far: A compact skills map + embeddings survives messy job posts better than rules alone.
Privacy is a feature: Users relax—and perform better—when they know their video never leaves the browser.
Instrument your demo: Health checks, stable samples, and error-tolerant endpoints turn a risky live demo into a smooth one.
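One way to get this kind of error tolerance is to wrap the LLM call so that any failure, or an explicit demo flag, yields a fixed output. The sketch below assumes a caller-supplied `llm_call` function and a canned question list; both are hypothetical and not taken from the project's code.

```python
from typing import Callable, Optional

# Hypothetical canned plan returned whenever the LLM is unavailable.
FALLBACK_QUESTIONS = [
    "Tell me about a time you delivered a project under a tight deadline.",
    "Describe a technical decision you made and how you measured its impact.",
]


def plan_questions(
    job_text: str,
    llm_call: Optional[Callable[[str], list[str]]] = None,
    demo_mode: bool = False,
) -> list[str]:
    """Return interview questions, falling back to a deterministic list."""
    if demo_mode or llm_call is None:
        return list(FALLBACK_QUESTIONS)
    try:
        questions = llm_call(job_text)
    except Exception:
        # Network errors, missing keys, rate limits: all degrade to the canned plan.
        return list(FALLBACK_QUESTIONS)
    # An empty LLM response is treated the same as a failure.
    return questions or list(FALLBACK_QUESTIONS)
```

Because the fallback path returns the same list every time, the rest of the flow can be rehearsed and smoke-tested without any API key.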
What's next for Voxtant
Live duplex interviewer (Gemini Live) with smarter turn-taking and follow-ups.
Chrome “Capture JD” extension to safely pull visible job text from any page.
Recruiter export: crisp ATS-ready bullet points and a coverage summary recruiters can skim in 30 seconds.
Accessibility modes: keyboard-first flow, reduced camera feedback, and dysfluency-aware coaching.
Deeper delivery analytics: topic coverage maps, answer-length guidance, and calibrated scoring against company values.
Built With
- css
- fastapi
- google-gemini-2.5-flash
- mediapipe
- next.js
- opencv
- pydantic
- python
- tailwind
- typescript