Inspiration
I wanted a voice + gesture interface that felt like commanding the human fleet in *Ender's Game*!
What it does
Eigenriver is a game that lets you control your starfleet in real-time using natural voice commands and hand gestures. Think wake word → speech-to-intent → instant scene changes, camera moves, and object manipulation.
How I built it
Faster-Whisper for STT, GPT OSS 120B on Cerebras for structured output, Mediapipe for gestures, LangGraph for multi-turn command handling, and Three.js for the 3D layer.
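"Structured output" here roughly means asking the model for JSON matching a fixed schema and validating it before anything touches the scene. A minimal sketch — the field names and action set below are illustrative assumptions, not the project's actual schema:

```python
import json
from dataclasses import dataclass

# Illustrative action set — the real command schema may differ.
VALID_ACTIONS = {"move", "attack", "hold", "regroup"}

@dataclass(frozen=True)
class FleetCommand:
    action: str
    target: str
    x: float  # destination on the 2D grid
    y: float

def parse_model_output(raw: str) -> FleetCommand:
    """Validate the LLM's JSON reply, raising on malformed output
    instead of mutating the scene, so a bad generation can never
    corrupt game state."""
    data = json.loads(raw)
    if data["action"] not in VALID_ACTIONS:
        raise ValueError(f"unknown action: {data['action']!r}")
    return FleetCommand(
        action=data["action"],
        target=str(data["target"]),
        x=float(data["x"]),
        y=float(data["y"]),
    )

reply = '{"action": "move", "target": "alpha", "x": 3, "y": -1}'
print(parse_model_output(reply))
# FleetCommand(action='move', target='alpha', x=3.0, y=-1.0)
```

Only commands that survive validation are forwarded to the Three.js layer, which is what makes a large model "behave" predictably downstream.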
Challenges I ran into
Making voice + hands feel natural, not clunky. Balancing latency vs accuracy. Getting large models to behave in a predictable, structured way.
Accomplishments that I'm proud of
A real-time, low-latency control loop that feels fun to use. Smooth hand tracking. The first iteration of a command grammar that could scale way beyond a hackathon demo.
What we learned
How to make multi-modal input actually feel good. The trade-offs between model size, inference speed, and UX. That structured outputs save you a ton of pain downstream.
What's next for Eigenriver
Richer gestures, more complex scene logic, co-op mode, full 3D movement beyond the current 2D grid, and support for more devices!
Also, I own eigenriver.com, so I'll be deploying there soon!
Built With
- cerebras
- langgraph
- mediapipe
- openai
- python