Inspiration

My main inspiration is Tony Stark building and visualizing things with Jarvis just by voice. That inspired me to try building a live agent you can interact with by voice.

What it does

Leo helps you develop your story, visualize its characters, and prepare for shooting or animating by generating storyboard references. You can define your own project style, and Leo will adapt to it.

How we built it

Leo is powered by the Gemini Live API, so the model can call tools that are declared in the web app, which is built with React and Hono. We use MongoDB to store the data, and we integrated Google Cloud services for authentication and the model gateway.
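To illustrate the tool-calling idea, here is a minimal sketch of how a tool could be declared in the web app and how a call coming back from the model could be routed to a handler. The tool name `create_character`, its parameters, and the `ToolDeclaration`, `ToolCall`, and `dispatchToolCall` names are all hypothetical; they are simplified stand-ins, not the Gemini SDK's own types or Leo's actual code.

```typescript
// Simplified stand-ins for illustration; these are not the Gemini SDK's own types.
interface ToolDeclaration {
  name: string;
  description: string;
  parameters: Record<string, { type: "string" | "number"; required: boolean }>;
}

interface ToolCall {
  name: string;
  args: Record<string, unknown>;
}

interface ToolResult {
  ok: boolean;
  data?: unknown;
  error?: string;
}

type ToolHandler = (args: Record<string, unknown>) => ToolResult;

// Hypothetical declaration that would be sent to the model when the session starts.
const toolDeclarations: ToolDeclaration[] = [
  {
    name: "create_character",
    description: "Add a character to the current story project.",
    parameters: {
      name: { type: "string", required: true },
      role: { type: "string", required: false },
    },
  },
];

// Handlers live in the web app; a Map avoids accidental hits on Object.prototype keys.
const toolHandlers = new Map<string, ToolHandler>([
  [
    "create_character",
    (args) => ({ ok: true, data: { name: args.name, role: args.role ?? "unspecified" } }),
  ],
]);

// Route a tool call from the model to its handler, or report an unknown tool.
function dispatchToolCall(call: ToolCall): ToolResult {
  const handler = toolHandlers.get(call.name);
  if (!handler) {
    return { ok: false, error: `Unknown tool: ${call.name}` };
  }
  return handler(call.args);
}
```

In a setup like this, the result of `dispatchToolCall` would be sent back to the model as the tool response, so an unknown tool name becomes an error the model can react to rather than a crash in the app.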

Challenges we ran into

Mainly tool call accuracy: the model sometimes called the wrong tool, resulting in invalid actions or values.
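One common way to soften this kind of problem is to validate every tool call before running it and to return the errors to the model so it can retry. The sketch below shows that idea under stated assumptions: the `generate_storyboard` tool, its `sceneId` and `frames` arguments, and the `validateToolCall` helper are hypothetical and are not taken from Leo's codebase.

```typescript
type ArgType = "string" | "number";

interface ToolSchema {
  // Argument name mapped to its expected type and whether it is required.
  args: Record<string, { type: ArgType; required: boolean }>;
}

interface IncomingCall {
  name: string;
  args: Record<string, unknown>;
}

// Hypothetical schema registry; the real tool list is not shown in this write-up.
const toolSchemas = new Map<string, ToolSchema>([
  [
    "generate_storyboard",
    {
      args: {
        sceneId: { type: "string", required: true },
        frames: { type: "number", required: false },
      },
    },
  ],
]);

// Returns a list of human-readable problems; an empty list means the call is safe to run.
function validateToolCall(call: IncomingCall): string[] {
  const schema = toolSchemas.get(call.name);
  if (!schema) {
    return [`Unknown tool: ${call.name}`];
  }

  const errors: string[] = [];

  // Check required arguments and the types of the arguments that were provided.
  for (const [argName, spec] of Object.entries(schema.args)) {
    const value = call.args[argName];
    if (value === undefined) {
      if (spec.required) errors.push(`Missing required argument: ${argName}`);
      continue;
    }
    if (typeof value !== spec.type) {
      errors.push(`Argument ${argName} should be a ${spec.type}`);
    }
  }

  // Flag arguments the schema does not know about, which often signal a wrong tool choice.
  for (const argName of Object.keys(call.args)) {
    if (!(argName in schema.args)) {
      errors.push(`Unexpected argument: ${argName}`);
    }
  }

  return errors;
}
```

If the returned list is not empty, the app could skip the action and send the messages back as the tool response, which gives the model a chance to correct the call instead of applying invalid values.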

Accomplishments that we're proud of

We're proud that we actually worked on this project and reached an MVP state. We're also proud that the agent can be used just by voice and carries out actions automatically.

What we learned

Definitely how to make the most of our build time by using coding agents, since we're very busy IRL. We also learned to pay closer attention to the details of the user experience when using Leo.

What's next for Leo Bot

We want to make Leo more capable, more natural, and much more responsive. The current UI/UX is still rough, so we definitely want to improve that too. Otherwise, we think the idea is pretty great overall.
