Multimodal Video Generation Model

A production-grade foundation model deployable via API, on-prem, or open weights.

//

Our Vision

We believe creative systems should be reliable enough for real production, flexible enough to adapt to different workflows, and open enough to improve through use. This is why we’re building an open, production-grade system for cinematic, synchronized audio–video creation.

//

Announcement

LTX-2.3 and What We Built on It

LTX-2.3 is a major engine upgrade: sharper detail, stronger motion, cleaner audio, and native portrait video up to 1080×1920. It’s a production-ready multimodal engine, built to be built on. LTX Desktop is our first implementation, running fully locally.

This engine is designed to be built on. We built on it first.

//

Open Source

Open Source,
by Design

An open source video generation model built to evolve in the real world. Excellence doesn’t happen in isolation. It emerges through real use, iteration, and collaboration. Open source is how we raise the bar — together.

Open by Default

Model weights, code, and core tooling are openly available for inspection, extension, and reuse.

Excellence Through Collaboration

Improvement through real-world usage, iterative experimentation, and community-driven collaboration.

//

Built on LTX

LTX Desktop.
Running on Your Machine.

A full video editor built on the LTX-2.3 engine, running locally on your hardware. Open weights, no cloud dependency, released as open software. Run it locally or integrate the engine via the API.

//

The LTX Stack

Build, Create, and Scale with LTX

Production-grade video generation models designed to hold up under real workloads. Built for long sequences, precise motion, and high-fidelity output, from fast iteration to final-quality renders.

Native Portrait

Generate vertical video up to 1080×1920 — trained on portrait-orientation data, not cropped from landscape.

Audio to Video

Generate video where voice, music, and sound effects define structure, pacing, and motion. Built for production-grade workflows that require precise, harmonious control over audio-led scenes, from podcasts and avatars to voice-driven clips, not one-off demos or talking heads.

20-Second Clips

Extend creative range with long-form generation. Produce up to 20 seconds of high-fidelity video with complete control and consistent style.

Native 4K 50 FPS

Generate cinematic-grade video with synchronized audio at true 4K / 50 fps. Built for professional workflows, ready for studio, developer, or enterprise production.

//

API

Access the Full Power of LTX-2 through an API Built for Production

//

Customer Voices

Success, Engineered Together

"For professional studios, this level of control is not optional.
Training and steering video models like LTX is the most viable way to align AI with real production needs, where predictability, ownership, and creative intent matter as much as visual quality."
Mohamed Oumoumad
CTO, Gear Productions