Inspiration
Our project is inspired by the educational content of 3Blue1Brown (3b1b), whose mathematics videos are animated with Manim, a library he created. We were curious whether a Large Language Model could autonomously generate content of similar quality, combining Manim's technical capabilities with the adaptive flexibility of AI-driven responses. Our ambition was to push the boundaries of AI-generated educational content toward the standard set by 3b1b's work.
What it does
Our application generates videos from a simple prompt using Manim to produce animations and transitions that explain the prompt’s content. Each video includes a synchronized voiceover that narrates the entire sequence, creating a comprehensive learning experience.
How we built it
We designed a sophisticated backend to interact with the o1-preview LLM, using advanced prompt engineering to ensure output that aligns precisely with Manim’s syntax. The backend handles parsing, tokenization, and structuring of this output to generate animation code without visual overlaps or errors. This code is rendered into animations and then synced with a voiceover, creating a smooth educational video. The frontend, built with React, Next.js, and Tailwind CSS, delivers a seamless UI, allowing users to engage with content interactively through pop-ups, animated components, and accessible controls.
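The parsing step described above can be sketched in Python. This is a minimal illustration, not our exact implementation: the function name `extract_manim_code` and the assumption that the model wraps its scene in a ```` ```python ```` fence are both hypothetical.

```python
import re

def extract_manim_code(llm_response: str) -> str:
    """Pull the first fenced Python code block out of an LLM response.

    The model is asked to wrap its Manim scene in ```python fences;
    anything outside the fence (explanations, notes) is discarded.
    """
    match = re.search(r"```(?:python)?\n(.*?)```", llm_response, re.DOTALL)
    if match:
        return match.group(1).strip()
    # Fall back to the raw response if the model skipped the fences.
    return llm_response.strip()
```

The extracted string can then be written to a file such as `scene.py` and rendered with Manim's CLI (`manim render scene.py`) before the voiceover is layered on.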
Challenges we ran into
The biggest challenge we ran into was getting the o1-preview LLM to generate suitable Manim code that renders correctly with no overlapping visuals. Since the LLM is a black box, the only lever we had was the prompt itself: supplying the right instructions, examples, and data sets to steer it toward exactly the output we wanted.
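The example-driven prompting described above can be sketched as assembling a few-shot message list for a chat-style API. Everything here is illustrative: `SYSTEM_PROMPT`, `EXAMPLES`, and `build_messages` are hypothetical names, and the single worked example is invented (note that o1-preview restricted system messages at launch, so in practice the instructions may need to be folded into the first user turn).

```python
# Hypothetical sketch of the few-shot prompt; names and examples are
# illustrative, not the exact ones we used.
SYSTEM_PROMPT = (
    "You are an expert Manim animator. Return a single Python code block "
    "containing one Scene subclass. Keep objects from overlapping by "
    "positioning them explicitly with .to_edge(), .next_to(), or .shift()."
)

EXAMPLES = [
    (
        "Explain the Pythagorean theorem",
        "from manim import *\n\n"
        "class Pythagoras(Scene):\n"
        "    def construct(self):\n"
        "        eq = MathTex(\"a^2 + b^2 = c^2\").to_edge(UP)\n"
        "        self.play(Write(eq))\n",
    ),
]

def build_messages(topic: str) -> list[dict]:
    """Assemble a chat message list: system instructions, worked examples
    as user/assistant turns, then the user's actual topic."""
    messages = [{"role": "system", "content": SYSTEM_PROMPT}]
    for prompt, code in EXAMPLES:
        messages.append({"role": "user", "content": prompt})
        messages.append({"role": "assistant", "content": f"```python\n{code}```"})
    messages.append({"role": "user", "content": topic})
    return messages
```

Each worked example demonstrates the explicit-positioning convention, which is what nudges the model away from overlapping visuals.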
Accomplishments that we're proud of
We're proud of overcoming our biggest challenge: getting the o1-preview LLM to produce a correctly rendered video and syncing the voiceover with it. We're also proud of building a UI that enhances the user experience.
What we learned
Throughout this project, we gained deeper insights into prompt engineering for LLMs, learning how to optimize responses by carefully structuring input examples. We also enhanced our understanding of video rendering processes and animation libraries like Manim. Our experience reinforced the importance of meticulous timing and formatting in automated video production.
What's next for Lucid
With more time, we would have generated longer videos that dig into the finer details of each concept. We would also like to extend Lucid to render three-dimensional animations.
Built With
- fastapi
- manim
- nextjs
- openai
- python