Inspiration
My main inspiration is Tony Stark building and visualizing things with Jarvis using only his voice. That inspired me to try making a live agent that you can interact with by voice.
What it does
Leo helps you develop your story, visualize its characters, and prepare for shooting or animating by generating storyboard references. You can define your own project style, and Leo will adapt to your liking.
How we built it
Leo is powered by the Gemini Live API. The model can call tools that are declared in the web app, which is built with React and Hono. We use MongoDB to store the data, and we integrated Google Cloud services for authentication and the model gateway.
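To illustrate the idea of tools declared in the web app, here is a minimal sketch of a tool registry and dispatcher. It is not Leo's actual code: the tool names (`add_character`, `list_characters`), the `ToolDeclaration` shape, and the in-memory `characters` array are all assumptions made for the example, and the Gemini Live API connection itself is left out.

```typescript
// Hypothetical shapes for illustration only; not Leo's actual code.
type ToolArgs = Record<string, unknown>;

interface ToolDeclaration {
  name: string;
  // Text that would be shown to the model so it knows when to use the tool.
  description: string;
  // Runs inside the web app when the model asks for this tool.
  handler: (args: ToolArgs) => string;
}

// In-memory stand-in for project data (assumed; the real app uses MongoDB).
const characters: string[] = [];

const tools: ToolDeclaration[] = [
  {
    name: "add_character",
    description: "Add a character to the current story.",
    handler: (args) => {
      const name = String(args.name);
      characters.push(name);
      return `Added character ${name}`;
    },
  },
  {
    name: "list_characters",
    description: "List the characters in the current story.",
    handler: () =>
      characters.length > 0 ? characters.join(", ") : "No characters yet",
  },
];

// A tool call as the model might emit it: a tool name plus JSON arguments.
interface ToolCall {
  name: string;
  args: ToolArgs;
}

// Look up the requested tool and run it; report unknown tools as text
// so the result can be passed back to the model.
function dispatchToolCall(call: ToolCall): string {
  const tool = tools.find((t) => t.name === call.name);
  if (!tool) {
    return `Unknown tool: ${call.name}`;
  }
  return tool.handler(call.args);
}
```

In a setup like this, the web app would pass each tool call it receives from the model to `dispatchToolCall` and send the returned string back as the tool result.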
Challenges we ran into
The main challenge was tool call accuracy: the model sometimes ran the wrong tool, resulting in invalid actions or values.
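One common way to limit the damage from a wrong tool call is to validate the tool name and arguments before running anything. The sketch below shows that idea only. It is not what Leo currently does, and the tool name `generate_storyboard_frame`, its arguments, and the allowed shot values are hypothetical.

```typescript
// Hypothetical argument specs for illustration; not Leo's actual tools.
interface ArgSpec {
  type: "string" | "number";
  required: boolean;
  // Optional whitelist of accepted string values.
  allowed?: string[];
}

type ToolSchema = Map<string, ArgSpec>;

const toolSchemas = new Map<string, ToolSchema>([
  [
    "generate_storyboard_frame",
    new Map<string, ArgSpec>([
      ["sceneId", { type: "string", required: true }],
      ["shot", { type: "string", required: true, allowed: ["wide", "medium", "close-up"] }],
      ["frameCount", { type: "number", required: false }],
    ]),
  ],
]);

interface ValidationResult {
  ok: boolean;
  errors: string[];
}

// Check a model-issued tool call against its schema before executing it.
function validateToolCall(name: string, args: Record<string, unknown>): ValidationResult {
  const schema = toolSchemas.get(name);
  if (!schema) {
    return { ok: false, errors: [`Unknown tool "${name}"`] };
  }

  const errors: string[] = [];

  for (const [key, spec] of schema) {
    const value = args[key];
    if (value === undefined) {
      if (spec.required) errors.push(`Missing required argument "${key}"`);
      continue;
    }
    if (typeof value !== spec.type) {
      errors.push(`Argument "${key}" must be a ${spec.type}`);
      continue;
    }
    if (spec.allowed && !spec.allowed.includes(String(value))) {
      errors.push(`Argument "${key}" must be one of: ${spec.allowed.join(", ")}`);
    }
  }

  // Reject arguments the schema does not know about.
  for (const key of Object.keys(args)) {
    if (!schema.has(key)) errors.push(`Unexpected argument "${key}"`);
  }

  return { ok: errors.length === 0, errors };
}
```

When validation fails, the error messages could be returned to the model as the tool result instead of executing the action, which gives it a chance to correct the call.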
Accomplishments that we're proud of
We're proud that we actually worked on this project and got it to an MVP state. We're also proud that the agent can be used by voice alone and carries out actions automatically.
What we learned
We learned to make the most of our building time by using coding agents, since we're very busy IRL. We also learned to pay closer attention to the details of the user experience when using Leo.
What's next for Leo Bot
We want to make Leo more capable, more natural, and much more responsive. The current UI/UX is still rough, so we definitely want to improve that too. Otherwise, the idea is pretty great overall.