🎬 Talk-to-Edit: AI-Powered Video Editing

Inspiration

Editing videos is powerful but painful. Traditional editors require scrubbing through timelines, hunting for clips, and stacking effects.

We asked: what if editing was as easy as telling your computer what to do?

That’s how this project was born: inspired by Cursor, but reimagined for video editing.
Instead of calling cut() or dragging and dropping clips, you just say:

“Trim the part where the guy speaking is silent.”

…and it happens.

⚙️ How We Built It

Backend (FastAPI + Twelve Labs + FFmpeg): Handles video/audio processing, trims clips, adds effects, and generates previews.
NLP (Cohere): Parses natural language into structured editing commands.
Video Search (Twelve Labs): Finds key moments (like “LeBron silent” or “when the person dies”).
Frontend: A minimal Cursor-like UI with chat-driven commands, video preview, and instant feedback.
Dev Environment (Windsurf): Used Windsurf’s AI-first IDE to rapidly prototype, refactor, and debug the stack under hackathon time pressure.
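
To make "structured editing commands" concrete, here is a minimal sketch of what a parsed command could look like once the language model's reply has been turned into JSON. The `EditCommand` fields, the set of allowed actions, and the JSON shape are all assumptions for illustration. They are not the project's actual schema, and nothing here calls the Cohere API.

```python
import json
from dataclasses import dataclass
from typing import Optional

# Hypothetical action names; the real project may support different ones.
ALLOWED_ACTIONS = {"trim", "add_effect", "add_sound"}


@dataclass(frozen=True)
class EditCommand:
    action: str                   # what to do, e.g. "trim"
    query: str                    # moment description handed to video search
    effect: Optional[str] = None  # optional effect or sound name


def parse_command(raw_json: str) -> EditCommand:
    """Validate a JSON string (assumed to be model output) into an EditCommand."""
    data = json.loads(raw_json)
    action = str(data.get("action", "")).strip().lower()
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported action: {action!r}")
    query = str(data.get("query", "")).strip()
    if not query:
        raise ValueError("command is missing a moment description")
    return EditCommand(action=action, query=query, effect=data.get("effect"))
```

Rejecting unknown actions at this step means a vague or malformed request fails early, before any video search or FFmpeg work starts.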

Workflow:
User Command → Cohere (parse intent) → Twelve Labs (find moment) → Executor (FFmpeg) → Preview Video
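
The workflow above can be sketched as a small function chain. This is a simplified illustration, not the project's code: `parse_intent` and `find_moment` are hypothetical callables standing in for the Cohere and Twelve Labs steps, and the executor only builds an FFmpeg argument list that keeps the matched segment. The FFmpeg options used (`-ss`, `-to`, `-c copy`) are standard.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Moment:
    start: float  # seconds from the beginning of the video
    end: float    # seconds from the beginning of the video


def build_trim_command(src: str, dst: str, moment: Moment) -> List[str]:
    """Build an FFmpeg argv that keeps only [start, end] of the source."""
    if moment.end <= moment.start:
        raise ValueError("moment must have positive duration")
    return [
        "ffmpeg", "-y",
        "-i", src,
        "-ss", f"{moment.start:.3f}",
        "-to", f"{moment.end:.3f}",
        "-c", "copy",  # stream copy: fast, but cut points snap to keyframes
        dst,
    ]


def run_pipeline(
    user_command: str,
    src: str,
    dst: str,
    parse_intent: Callable[[str], str],           # stand-in for the NLP step
    find_moment: Callable[[str, str], Moment],    # stand-in for video search
) -> List[str]:
    """User command -> search query -> moment -> FFmpeg argv."""
    query = parse_intent(user_command)
    moment = find_moment(src, query)
    return build_trim_command(src, dst, moment)
```

The returned list can be passed to `subprocess.run(argv, check=True)` on a machine with FFmpeg installed. Cutting a segment out of the middle of a clip, rather than keeping one, would need two such cuts plus a concatenation step, which this sketch leaves out.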

📚 What We Learned

How much AI-powered development environments like Windsurf accelerate building — almost like pair-programming with a senior engineer.
How multimodal AI (language + video) can completely reshape creative tools.
The trade-offs between real-time performance and hackathon-speed prototyping.
Why ruthless scoping matters: better to demo 3 magical features than 10 broken ones.
Even for creative tools, structured pipelines (NLP → search → execution) simplify everything.

🚧 Challenges We Faced

Latency: Chaining Cohere, our server, and Twelve Labs on every request was slow → solved with mocks + pre-indexed demos.
Parsing ambiguity: Natural language is messy → solved with careful Cohere prompting + heuristics.
Media handling: Syncing audio overlays with video timelines → FFmpeg saved us.
Time pressure: 30 hours forced razor-thin scope.
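
As an illustration of the media-handling point, here is one way to place a sound effect at a specific timestamp with FFmpeg. It is a sketch under assumptions: the function name and file names are hypothetical, the source video is assumed to already have an audio stream, and the project's real filter graph may differ. The `adelay` and `amix` filters are standard FFmpeg audio filters.

```python
from typing import List


def build_overlay_command(video: str, sfx: str, dst: str, at_seconds: float) -> List[str]:
    """Build an FFmpeg argv that mixes `sfx` into the video's audio at an offset."""
    if at_seconds < 0:
        raise ValueError("offset must be non-negative")
    delay_ms = int(round(at_seconds * 1000))
    # Delay the effect (both stereo channels), then mix it with the original
    # audio; duration=first keeps the output as long as the original track.
    filter_graph = (
        f"[1:a]adelay={delay_ms}|{delay_ms}[sfx];"
        "[0:a][sfx]amix=inputs=2:duration=first[mixed]"
    )
    return [
        "ffmpeg", "-y",
        "-i", video,
        "-i", sfx,
        "-filter_complex", filter_graph,
        "-map", "0:v",      # keep the original video stream
        "-map", "[mixed]",  # use the mixed audio as the output audio
        "-c:v", "copy",     # do not re-encode video; only audio changes
        dst,
    ]
```

Because the offset is expressed on the video's own timeline, the same timestamp returned by the search step can be reused directly as `at_seconds`.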

🚀 What’s Next

Expand effect libraries (glitch, transitions, auto-cuts).
Real-time collaboration — multiple users editing via chat.
Distributed video rendering for scaling beyond demos.

💡 Final Thought

We set out to answer one question:

What if video editing was as simple as talking to your video?

This project is our first step toward that future.

🛠️ Built With

Cohere
FastAPI
FFmpeg
Twelve Labs
Windsurf
HTML, CSS, JavaScript, Python

