Generative Manim

🎨 GPT-4o powered generative videos. ⚡️ Join our Discord server!


Preview

πŸš€ Concept

Generative Manim (GM) is a suite of tools for creating videos with Manim using LLMs (Large Language Models) such as GPT-4 or Claude. The goal is to let anyone create wonderful animations from plain text ✨.

It began as a prototype of a web app that used GPT-4 to generate videos with Manim. The project combines the programming power of LLMs, their understanding of human language, and the animation capabilities of Manim into a tool anyone can use to create videos, regardless of their programming or video-editing skills.

πŸ’» Models

Models are the core of Generative Manim. A model converts text into Manim code, which can then be rendered into a video.
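As a rough sketch of that text-to-code step, the request to a chat model might be assembled as below. The system prompt and the `GenScene` class name are illustrative assumptions here, not the repository's actual prompt:

```python
# Minimal sketch of the text-to-code step. With the OpenAI client, the
# resulting payload would be passed to
# client.chat.completions.create(model="gpt-4o", messages=messages).
SYSTEM_PROMPT = (
    "You are an expert Manim developer. Return only a complete Python "
    "script defining a single Scene subclass named GenScene."
)

def build_messages(user_prompt: str) -> list[dict]:
    """Assemble the chat messages sent to the code-generation model."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_prompt},
    ]

messages = build_messages("Draw a blue circle that transforms into a square")
```

The model's reply is then extracted as code and handed to the Manim renderer.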

| Name | Description | Engine | Phase |
| --- | --- | --- | --- |
| GM GPT-4o | Latest GPT model from OpenAI powered by a custom System Prompt | GPT-4o | ✅ |
| GM GPT-3.5 Fine Tuned | First fine-tuned model of GPT-3.5 | GPT-3.5 | ✅ |
| GM GPT-3.5 Physics Fine Tuned | Fine-tuned GPT-3.5 model trained to generate Physics animations | GPT-3.5 | ✅ |
| GM Claude Sonnet | Claude 3 Sonnet model from Anthropic adapted with our custom System Prompt | claude-3-sonnet-20240229 | ✅ |
| GM Claude Sonnet 3.5 | Claude 3.5 Sonnet model from Anthropic adapted with our custom System Prompt | claude-3-5-sonnet-20240620 | ✅ |
| GM Qwen 2.5 Coder 7B | Open-source model fine-tuned with an SFT + DPO + GRPO pipeline | Qwen2.5-Coder-7B-Instruct | 🚧 |
| GM DeepSeek Coder V2 Lite | Open-source model fine-tuned with an SFT + DPO + GRPO pipeline | DeepSeek-Coder-V2-Lite | 🚧 |
| GM CodeLlama 7B | Open-source model fine-tuned with an SFT + DPO + GRPO pipeline | CodeLlama-7b-Instruct | 🚧 |

πŸ“‘ New Models

If you want to suggest a new model, please open an issue in the repository or talk with us on our Discord server.

🧠 Training Pipeline

We're training open-source models to generate Manim code using a 3-stage pipeline that distills from GPT-4o:

  1. SFT (Supervised Fine-Tuning): train on 5,000+ validated prompt→code pairs
  2. DPO (Direct Preference Optimization): learn from render success/failure pairs
  3. GRPO (Group Relative Policy Optimization): RL with the Manim renderer as a deterministic reward signal
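The DPO stage consumes preference pairs. A minimal sketch of how such pairs might be assembled from render outcomes follows; the dict schema (`prompt`, `code`, `rendered`) is illustrative, not the repository's actual format:

```python
def build_dpo_pairs(samples):
    """Group generated code samples by prompt and pair each successful
    render (chosen) with each failed one (rejected).

    `samples` is an iterable of dicts with keys: prompt, code, rendered.
    """
    by_prompt = {}
    for s in samples:
        by_prompt.setdefault(s["prompt"], []).append(s)

    pairs = []
    for prompt, group in by_prompt.items():
        ok = [s["code"] for s in group if s["rendered"]]
        bad = [s["code"] for s in group if not s["rendered"]]
        # One preference record per (success, failure) combination.
        for chosen in ok:
            for rejected in bad:
                pairs.append(
                    {"prompt": prompt, "chosen": chosen, "rejected": rejected}
                )
    return pairs
```

Prompts with only successes or only failures contribute no pairs, so sampling several generations per prompt matters for coverage.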

The key insight: Manim is a deterministic verifier; code either renders or crashes. This removes the need for a learned reward model, similar to how DeepSeek-R1 uses math answer checkers.
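Because the reward is simply "did it render", the GRPO reward function can be sketched as below. The real pipeline would shell out to the Manim renderer; this offline stand-in uses Python's `compile()` as a cheap proxy for "renders":

```python
def render_reward(code: str) -> float:
    """Binary reward from the Manim verifier.

    In the actual pipeline, the generated script is rendered (e.g. via a
    subprocess invoking Manim) and rewarded 1.0 on success, 0.0 on a
    crash. As an offline stand-in, this sketch only checks that the
    code compiles.
    """
    try:
        compile(code, "<generated>", "exec")  # proxy for a successful render
        return 1.0
    except SyntaxError:
        return 0.0
```

GRPO then normalizes these rewards within each group of samples for the same prompt, so no separate value or reward network is needed.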

Base models: Qwen 2.5 Coder 7B, DeepSeek Coder V2 Lite, and CodeLlama 7B, all fine-tuned with QLoRA (4-bit) to fit on free Kaggle T4 GPUs.

πŸ“ Benchmark

Generative Manim now includes an executable benchmark MVP for expert Manim code generation under training/benchmarks.

The benchmark is built around the right primitives for programming evaluation:

  • a frozen task suite
  • render-based scoring
  • Manim-specific structural checks
  • pass@k for stochastic code generation
  • reproducible JSONL and JSON reports
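pass@k for stochastic generation is typically computed with the unbiased estimator from the Codex paper (Chen et al., 2021); whether the benchmark uses exactly this formula is an assumption here, but it is the standard choice:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: probability that at least one of k samples
    drawn from n generations passes, given that c of the n passed."""
    if n - c < k:
        return 1.0  # every size-k draw must contain a passing sample
    return 1.0 - comb(n - c, k) / comb(n, k)

# e.g. 5 generations per task, 2 rendered successfully:
print(round(pass_at_k(5, 2, 1), 3))  # → 0.4
```

This is why the benchmark runner takes a samples-per-task count and a list of k values (e.g. `5 ... 1,5` in the full-flow command below).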

Start here:

```bash
cd training
python -m benchmarks.run export \
  --suite benchmarks/tasks/core_v1.jsonl \
  --output ./outputs/benchmarks/core_v1_prompts.jsonl
```

Then use the generated prompt file with `python -m eval.generate_responses ...`, or run the full flow with:

```bash
bash ./scripts/run_benchmark.sh qwen2.5-coder-7b ./outputs/grpo/qwen2.5-coder-7b benchmarks/tasks/core_v1.jsonl grpo 5 0.8 1,5
```

See training/benchmarks/README.md for the benchmark design and workflow.

Once you have multiple benchmark runs, compare them with:

```bash
cd training
python -m benchmarks.compare --results-dir ./outputs/benchmarks --suite core_v1
```

Or run a whole benchmark matrix from a manifest:

```bash
cd training
python -m benchmarks.matrix --manifest benchmarks/manifests/open_source_core_v1.json --dry-run
```

✨ Sponsors

Generative Manim is currently sponsored by The Astronomical Software Company.

🀲 Contributing

Generative Manim is an open-source project.

If you want to add a new feature, fix a bug, or contribute something new, fork the repository and make your changes. Pull requests are warmly welcome. You can also join our Discord server to discuss new features, bugs, or any other topic.