Inspiration
Our project is inspired by the educational content of 3Blue1Brown (3b1b), whose mathematics videos are animated with Manim, a library he created. We were curious whether a Large Language Model could autonomously generate content of similar quality, combining Manim's technical capabilities with the adaptive flexibility of AI-driven responses. Our ambition was to push the boundaries of AI-generated educational content toward the standard set by 3b1b's work.
What it does
Our application generates videos from a simple prompt using Manim to produce animations and transitions that explain the prompt’s content. Each video includes a synchronized voiceover that narrates the entire sequence, creating a comprehensive learning experience.
How we built it
We designed a sophisticated backend to interact with the o1-preview LLM, using advanced prompt engineering to ensure output that aligns precisely with Manim’s syntax. The backend handles parsing, tokenization, and structuring of this output to generate animation code without visual overlaps or errors. This code is rendered into animations and then synced with a voiceover, creating a smooth educational video. The frontend, built with React, Next.js, and Tailwind CSS, delivers a seamless UI, allowing users to engage with content interactively through pop-ups, animated components, and accessible controls.
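The parsing step described above can be sketched in Python. This is a minimal illustration, not our exact implementation: the function name `extract_manim_code` and the assumption that the model wraps its scene in a ```` ```python ```` fence are both hypothetical.

```python
import re

def extract_manim_code(llm_response: str) -> str:
    """Pull the first fenced Python code block out of an LLM response.

    The model is asked to wrap its Manim scene in ```python fences;
    anything outside the fence (explanations, notes) is discarded.
    """
    match = re.search(r"```(?:python)?\n(.*?)```", llm_response, re.DOTALL)
    if match:
        return match.group(1).strip()
    # Fall back to the raw response if the model skipped the fences.
    return llm_response.strip()
```

The extracted string can then be written to a file such as `scene.py` and rendered with Manim's CLI (`manim render scene.py`) before the voiceover is layered on.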
Challenges we ran into
The biggest challenge we ran into was getting the o1-preview LLM to generate suitable Manim code that renders correctly with no overlapping visuals. Since the LLM is a black box, the only lever we had was the prompt itself: supplying the right instructions, examples, and data sets to steer it toward exactly the output we wanted.
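The example-driven prompting described above can be sketched as assembling a few-shot message list for a chat-style API. Everything here is illustrative: `SYSTEM_PROMPT`, `EXAMPLES`, and `build_messages` are hypothetical names, and the single worked example is invented (note that o1-preview restricted system messages at launch, so in practice the instructions may need to be folded into the first user turn).

```python
# Hypothetical sketch of the few-shot prompt; names and examples are
# illustrative, not the exact ones we used.
SYSTEM_PROMPT = (
    "You are an expert Manim animator. Return a single Python code block "
    "containing one Scene subclass. Keep objects from overlapping by "
    "positioning them explicitly with .to_edge(), .next_to(), or .shift()."
)

EXAMPLES = [
    (
        "Explain the Pythagorean theorem",
        "from manim import *\n\n"
        "class Pythagoras(Scene):\n"
        "    def construct(self):\n"
        "        eq = MathTex(\"a^2 + b^2 = c^2\").to_edge(UP)\n"
        "        self.play(Write(eq))\n",
    ),
]

def build_messages(topic: str) -> list[dict]:
    """Assemble a chat message list: system instructions, worked examples
    as user/assistant turns, then the user's actual topic."""
    messages = [{"role": "system", "content": SYSTEM_PROMPT}]
    for prompt, code in EXAMPLES:
        messages.append({"role": "user", "content": prompt})
        messages.append({"role": "assistant", "content": f"```python\n{code}```"})
    messages.append({"role": "user", "content": topic})
    return messages
```

Each worked example demonstrates the explicit-positioning convention, which is what nudges the model away from overlapping visuals.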
Accomplishments that we're proud of
We're proud of overcoming our biggest challenge: getting the o1-preview LLM to produce a correctly rendered video and syncing the voiceover with it. We're also proud of building a UI that enhances the user experience.
What we learned
Throughout this project, we gained deeper insights into prompt engineering for LLMs, learning how to optimize responses by carefully structuring input examples. We also enhanced our understanding of video rendering processes and animation libraries like Manim. Our experience reinforced the importance of meticulous timing and formatting in automated video production.
What's next for Lucid
With more time, we would have generated longer videos that dig into the finer details of each concept. We would also like to extend Lucid to render three-dimensional animations.
Built With
- fastapi
- manim
- nextjs
- openai
- python