Playwright

Inspiration

Filmmaking is one of the most powerful storytelling mediums, but it’s also one of the most expensive and logistically complex. Directors often have vivid ideas in their heads, yet communicating tone, pacing, shot composition, and emotion to producers and collaborators can be difficult, especially in early pre-production.

We were inspired by the gap between imagination and execution. What if directors could instantly prototype their scenes the way developers prototype apps? What if a script could become a visual and auditory experience within minutes?

Playwright was born from the idea of giving storytellers a sandbox: a way to test scenes, experiment with pacing, and communicate creative intent clearly before investing time and money into real-world production.


What it does

Playwright is an AI-powered beat generator that transforms a written scene into a multimedia storyboard experience.

Here’s how it works:

  1. A user submits a script of a scene.
  2. We use Databricks for lightweight experiment tracking and analytics on the scene segmentation pipeline.
  3. We use the Gemini API to segment the script into narrative beats.
  4. Each beat is sent into our Stable Diffusion pipeline to generate visual frames.
  5. In parallel, we generate:
    • Narration for the scene
    • Background music to match tone and pacing
  6. We present the user with a structured breakdown of:
    • Script beats
    • Generated images
    • Audio components
  7. Users can optionally export their images to Figma, which allows for:
    • Collaboration
    • Visual markups
    • Rearrangement
    • Annotations
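
The flow above can be sketched in a few lines of Python. This is an illustrative outline, not the project's actual code: every stage function here (segment_script, generate_image, generate_narration, generate_music) is a hypothetical stand-in for the real model call, and the blank-line split only imitates segmentation. The point it shows is that image and audio generation do not depend on each other, so they can be submitted together.

```python
from concurrent.futures import ThreadPoolExecutor

# All four stage functions are hypothetical placeholders for real model calls.
def segment_script(script: str) -> list[str]:
    # Placeholder: split on blank lines instead of calling a language model.
    return [part.strip() for part in script.split("\n\n") if part.strip()]

def generate_image(beat: str) -> str:
    return f"image<{beat[:20]}>"

def generate_narration(beats: list[str]) -> str:
    return f"narration<{len(beats)} beats>"

def generate_music(beats: list[str]) -> str:
    return f"music<{len(beats)} beats>"

def build_storyboard(script: str) -> dict:
    beats = segment_script(script)
    with ThreadPoolExecutor() as pool:
        # Images and audio are independent, so submit them all before waiting.
        image_futures = [pool.submit(generate_image, beat) for beat in beats]
        narration_future = pool.submit(generate_narration, beats)
        music_future = pool.submit(generate_music, beats)
        return {
            "beats": beats,
            "images": [future.result() for future in image_futures],
            "narration": narration_future.result(),
            "music": music_future.result(),
        }
```

Calling build_storyboard on a two-paragraph script returns two beats, two image placeholders, and one narration and one music placeholder.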

What makes Playwright unique is iteration. Users can reprompt specific beats to refine visuals, adjust tone, or correct drift from the director’s vision. Instead of regenerating everything, creators can surgically modify parts of a scene.

The result is a fast, low-cost way for directors to prototype scenes, pitch ideas to producers, and experiment creatively before going into production.


How we built it

Playwright is built as a modular AI pipeline.

1. Script Segmentation

  • We use the Gemini API to analyze the input script.
  • It breaks the scene into structured beats based on dialogue, action shifts, and emotional transitions.
  • The output is formatted into prompt-ready segments.
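
As a rough illustration of what "prompt-ready segments" could mean in practice, the sketch below parses a JSON array of beats into typed records. The JSON shape, the field names (text, tone), and the Beat class are assumptions made for this example; the actual output schema is not described here, and no model call is made.

```python
import json
from dataclasses import dataclass

@dataclass
class Beat:
    index: int
    text: str
    tone: str

def parse_beats(raw: str) -> list[Beat]:
    """Parse a JSON array of beat objects, skipping entries with no text."""
    beats = []
    for item in json.loads(raw):
        text = str(item.get("text", "")).strip()
        if not text:
            continue  # skip empty chunks rather than storyboard them
        # Indices are assigned after filtering so they stay contiguous.
        beats.append(Beat(index=len(beats), text=text, tone=item.get("tone", "neutral")))
    return beats

# Hypothetical model output, used only to exercise the parser.
sample = '[{"text": "Mara enters the diner.", "tone": "tense"}, {"text": "  "}, {"text": "She sits."}]'
```

Validating the structure at this boundary keeps malformed or empty segments out of the image and audio stages.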

2. Visual Generation

  • Each beat is transformed into a detailed visual prompt.
  • We use Stable Diffusion to generate images representing each narrative moment.
  • We engineered prompt templates to maintain visual consistency across beats.
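
One simple way a prompt template can encourage consistency is to prepend the same style description to every beat and to repeat a fixed description for each character who appears. The sketch below assumes that approach; SceneStyle, build_visual_prompt, and the example descriptions are hypothetical and are not the templates the project used.

```python
from dataclasses import dataclass, field

@dataclass
class SceneStyle:
    look: str  # shared style text repeated in every prompt
    characters: dict[str, str] = field(default_factory=dict)  # name -> fixed description

def build_visual_prompt(beat_text: str, style: SceneStyle) -> str:
    # Only describe characters whose names are mentioned in this beat.
    present = [
        f"{name}: {description}"
        for name, description in style.characters.items()
        if name.lower() in beat_text.lower()
    ]
    parts = [style.look, beat_text.strip()]
    if present:
        parts.append("Characters: " + "; ".join(present))
    return " | ".join(parts)
```

Because the style text and character descriptions come from one shared object, changing the look of the whole scene is a single edit rather than a per-beat rewrite.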

3. Audio Generation

  • Narration is generated from the script using text-to-speech.
  • Background music is generated or selected to match the emotional tone of each beat.
  • Audio is layered to simulate a rough cinematic preview.
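
For the "selected" case of background music, tone matching can be as simple as a lookup from a beat's tone label to a track, with a neutral fallback. The sketch below assumes that design; the tone labels, file names, and function names are all hypothetical.

```python
# Hypothetical tone-to-track table; the file names are placeholders.
MUSIC_LIBRARY = {
    "tense": "low_strings.wav",
    "joyful": "bright_piano.wav",
    "somber": "solo_cello.wav",
}
DEFAULT_TRACK = "ambient_pad.wav"

def pick_track(tone: str) -> str:
    # Normalize the label so "Tense" and " tense " resolve to the same track.
    return MUSIC_LIBRARY.get(tone.strip().lower(), DEFAULT_TRACK)

def plan_audio(beats: list[tuple[str, str]]) -> list[dict]:
    """Pair each (text, tone) beat with its narration text and a music track."""
    return [{"narration_text": text, "music": pick_track(tone)} for text, tone in beats]
```

An unknown tone falls back to the neutral track instead of failing, which keeps the preview playable even when the tone label is unexpected.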

4. Interactive Editing Layer

  • Users can reprompt individual beats.
  • We maintain state to avoid full regeneration.
  • This allows directors to iterate like they would with a creative team.
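
A minimal sketch of per-beat state, assuming each beat keeps its own image, revision counter, and accumulated director notes. The Storyboard and BeatState classes and the render callback are hypothetical; the render function stands in for the image pipeline.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class BeatState:
    text: str
    image: str
    revision: int = 0
    notes: list[str] = field(default_factory=list)

class Storyboard:
    def __init__(self, beats: list[str], render: Callable[[str], str]):
        self.render = render
        # Initial pass: render every beat once.
        self.beats = [BeatState(text=beat, image=render(beat)) for beat in beats]

    def reprompt(self, index: int, note: str) -> BeatState:
        """Re-render a single beat with an extra note; other beats are untouched."""
        beat = self.beats[index]
        beat.notes.append(note)
        beat.image = self.render(beat.text + " | " + "; ".join(beat.notes))
        beat.revision += 1
        return beat
```

With three beats, one reprompt triggers exactly one additional render call, and the images for the other two beats stay as they were.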

5. Figma Export

  • We developed a Figma plugin using the Figma API.
  • Using Figma Make, we created a custom storyboarding template for viewing the generated images.

The system is designed to be extensible, so each component such as segmentation, visuals, and audio can be improved independently.


Challenges we ran into

Beat Segmentation Accuracy

Not all scripts are structured clearly. We had to refine prompting strategies to ensure Gemini produced meaningful narrative beats instead of arbitrary text chunks.

Visual Consistency Across Beats

Stable Diffusion can drift in character appearance, lighting, or style between frames. Maintaining continuity required careful prompt engineering and structured context passing.

Tone Matching Between Modalities

Ensuring that the visuals, narration, and background music aligned emotionally was more challenging than we anticipated. A dramatic scene paired with overly cheerful music immediately breaks immersion. We also had to dynamically balance the background music with the narration, carefully adjusting levels so users could clearly understand the voiceover while still experiencing the intended atmosphere and sound design.
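
The level balancing described above is commonly called ducking: the music is turned down whenever the voice is active. The sketch below is a deliberately crude, per-sample version over plain lists of floats in the range -1.0 to 1.0. The gain values and threshold are illustrative assumptions, and a production mixer would smooth the gain changes over time rather than switch them instantly.

```python
def mix_with_ducking(narration, music, music_gain=0.6, ducked_gain=0.2, threshold=0.05):
    """Mix two sample lists, lowering the music wherever the narration is audible."""
    length = max(len(narration), len(music))
    mixed = []
    for i in range(length):
        # Treat the shorter track as silence once it runs out.
        voice = narration[i] if i < len(narration) else 0.0
        bed = music[i] if i < len(music) else 0.0
        gain = ducked_gain if abs(voice) > threshold else music_gain
        sample = voice + gain * bed
        mixed.append(max(-1.0, min(1.0, sample)))  # clip to the valid range
    return mixed
```

Where the narration is silent the music plays at its normal gain, and where the narration is present the music drops to the ducked gain so the voiceover stays intelligible.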

Iterative Regeneration Without Breaking Flow

Allowing users to reprompt specific beats while preserving context required architectural decisions around state management and caching.
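
One way to make caching respect context is to derive the cache key from both the prompt and the shared scene context, so that a changed style invalidates old images while an unchanged beat is reused. This is a sketch under that assumption; RenderCache and its render callback are hypothetical names, and the context is assumed to be JSON-serializable.

```python
import hashlib
import json

class RenderCache:
    def __init__(self, render):
        self.render = render  # hypothetical function: (prompt, context) -> image
        self.store = {}
        self.misses = 0

    @staticmethod
    def key(prompt: str, context: dict) -> str:
        # sort_keys makes the key independent of dictionary insertion order.
        payload = json.dumps({"prompt": prompt, "context": context}, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def get(self, prompt: str, context: dict):
        cache_key = self.key(prompt, context)
        if cache_key not in self.store:
            self.misses += 1
            self.store[cache_key] = self.render(prompt, context)
        return self.store[cache_key]
```

Requesting the same prompt with the same context twice renders once; changing any part of the context produces a new key and a fresh render.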

Compute Resources and Infrastructure

Running multimodal inference across segmentation, image generation, and audio synthesis required careful resource management. Stable Diffusion is GPU-intensive, so we optimized batching, caching, and model loading to keep latency reasonable. On the API side, managing request limits and token usage for script segmentation required prompt efficiency and safeguards to avoid unnecessary calls. Balancing performance, scalability, and cost was a constant architectural challenge.
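
Two of the safeguards mentioned above, batching and limiting API calls, can be illustrated with small helpers. The sketch below is an assumption about how such guards might look, not the project's implementation; chunked and CallBudget are hypothetical names, and the batch size and call limit are arbitrary.

```python
def chunked(items: list, size: int):
    """Yield consecutive slices of at most `size` items, e.g. prompts per GPU batch."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]

class CallBudget:
    """Hard cap on the number of API calls allowed for one scene."""
    def __init__(self, max_calls: int):
        self.max_calls = max_calls
        self.used = 0

    def spend(self) -> None:
        # Refuse the call up front instead of discovering the limit from an API error.
        if self.used >= self.max_calls:
            raise RuntimeError("API call budget exhausted")
        self.used += 1
```

Calling spend() before each segmentation request turns runaway retry loops into an explicit error, and chunked lets the image stage process prompts in fixed-size groups.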


Accomplishments that we’re proud of

  • Turning a raw script into a structured audiovisual experience in minutes.
  • Creating a workflow that feels closer to directing than prompting.
  • Enabling partial regeneration instead of forcing full scene rebuilds.
  • Building a tool that has practical value for filmmakers on tight budgets.

Most importantly, we built something that does not just generate content. It supports creative decision-making and helps storytellers communicate their vision more clearly.


What we learned

  • AI is powerful, but creative control is everything. Tools must enhance the creator, not override them.
  • Prompt engineering is closer to directing than programming.
  • Iteration speed is more important than perfect first outputs.

We also learned that filmmakers do not just want generation. They want collaboration. That insight shaped our reprompting and refinement system.


What’s next for Playwright

We see Playwright evolving into a full pre-production suite for filmmakers.

Next steps include:

  • Character consistency systems with persistent embeddings
  • Camera control with shot types, lens simulation, and framing options
  • A timeline-based editing interface
  • Exportable animatics
  • Multi-scene story arc management
  • Collaboration tools for directors and producers

Long term, we envision Playwright becoming a creative bridge that allows directors to communicate ideas clearly, test scenes affordably, and bring stronger concepts into real-world production.
