Qwen Image Edit 2509: Combine Multiple Images Into One Scene for Fashion, Products, Poses & more

Prompt: A woman in image 1 is riding a vintage in image 2. A green trash bin in image 3 is visible beside her.

Ever wanted to merge the perfect outfit with the ideal pose, or showcase different products all in a single, seamless scene? Whether you’re a designer looking to preview fashion combinations, a marketer building composite product showcases, or an artist bringing creative concepts to life, this workflow gives you the power to experiment freely and see instant, consistent results.

What is Qwen Image Edit 2509? Why is it better?

Source: Qwen Image

Qwen Image Edit 2509 lets you combine multiple images into a single scene using text prompts. Reference each input by number in your prompt ("image 1," "image 2," "image 3") and the model places them together.
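If you script your edits (for example, through ComfyUI's API) instead of typing prompts by hand, the numbering convention is easy to automate. Here's a minimal sketch; the helper and its template are hypothetical, not part of the workflow:

# Hypothetical helper: build a prompt that references each input as
# "image N", where N follows the order of the workflow's Load Image nodes.
def build_prompt(template: str, subjects: list[str]) -> str:
    refs = {f"img{i + 1}": f"{desc} in image {i + 1}"
            for i, desc in enumerate(subjects)}
    return template.format(**refs)

prompt = build_prompt(
    "{img1} is riding {img2}. {img3} is visible beside her.",
    ["A woman", "the vintage scooter", "A green trash bin"],
)
# -> "A woman in image 1 is riding the vintage scooter in image 2.
#     A green trash bin in image 3 is visible beside her."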

Released in September 2025 (the "2509" stands for the year and month of release), this update handles complex multi-image edits better than the original Qwen Image Edit: faster processing, better alignment between elements, and more realistic results when combining different inputs.

Useful for product mockups, fashion previews, or any project where you need to merge different elements into one cohesive image.

Prompt: An orange cat in image 1 and a white dog in image 2 meet and greet each other at the grassy place in image 3.

Why settle for one-at-a-time edits, when your next masterpiece could start with everything you need, all at once?

Download Workflow

Installation guide

  1. Download the workflow file
  2. Open ComfyUI (local or ThinkDiffusion)
  3. Drag the workflow file into the ComfyUI window
  4. If you see red nodes, install missing components:
  • ComfyUI Manager > Install Missing Custom Nodes

Verified to work on ThinkDiffusion Build: September 29, 2025

ComfyUI v0.3.60 with qwen_image_edit_2509_fp8_e4m3fn.safetensors model support

Minimum Machine Size: Ultra

Use the specified machine size or higher to ensure it meets the VRAM and performance requirements of the workflow

💡
Download the workflow and drag & drop it into your ComfyUI window, whether locally or on ThinkDiffusion. If you're using ThinkDiffusion, the minimum requirement is the Turbo 24GB machine, but we recommend the Ultra 48GB machine.

Custom Nodes

If there are red nodes in the workflow, it means the workflow is missing required nodes. Install those custom nodes for the workflow to work.

  1. Go to the ComfyUI Manager > Click Install Missing Custom Nodes
  2. Check the list below for any custom nodes that need to be installed, and click Install.

Required Models

For this guide you'll need to download these 4 recommended models.

1. qwen_image_edit_2509_fp8_e4m3fn.safetensors
2. Qwen-Image-Lightning-4steps-v1.0.safetensors
3. qwen_2.5_vl_7b_fp8_scaled.safetensors
4. qwen_image_vae.safetensors
  1. Go to ComfyUI Manager > Click Model Manager
  2. Search for the models above; when you find the exact model you're looking for, click Install, and make sure to press Refresh when you are finished.

If Model Manager doesn't have them: Use the direct download links (included with the workflow) and upload through ThinkDiffusion MyFiles > Upload URL. Refer to our docs for more guidance on this.

You can also use the model path source instead by pasting the model's link address into ThinkDiffusion MyFiles using Upload URL.

Model Name → ThinkDiffusion Upload Directory

qwen_image_edit_2509_fp8_e4m3fn.safetensors → .../comfyui/models/diffusion_models/
Qwen-Image-Lightning-4steps-v1.0.safetensors → .../comfyui/models/lora/
qwen_2.5_vl_7b_fp8_scaled.safetensors → .../comfyui/models/text_encoders/
qwen_image_vae.safetensors → .../comfyui/models/vae/
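If you're running ComfyUI locally rather than on ThinkDiffusion, a short script can place each file into the matching folder. This is a minimal sketch assuming a default local install layout; the direct download links ship with the workflow, so they are left blank here rather than guessed:

import urllib.request
from pathlib import Path

COMFYUI_MODELS = Path("ComfyUI/models")  # adjust to your install

# Destination subfolders, mirroring the table above.
MODELS = {
    "qwen_image_edit_2509_fp8_e4m3fn.safetensors": "diffusion_models",
    "Qwen-Image-Lightning-4steps-v1.0.safetensors": "lora",  # "loras" on stock ComfyUI
    "qwen_2.5_vl_7b_fp8_scaled.safetensors": "text_encoders",
    "qwen_image_vae.safetensors": "vae",
}
URLS = {}  # fill in the direct links bundled with the workflow

for filename, subdir in MODELS.items():
    dest = COMFYUI_MODELS / subdir / filename
    dest.parent.mkdir(parents=True, exist_ok=True)
    if filename in URLS and not dest.exists():
        print(f"Downloading {filename} -> {dest}")
        urllib.request.urlretrieve(URLS[filename], dest)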

Step-by-step Workflow Guide

This workflow was pretty easy to set up and runs well with the default settings. Here are a few steps where you might want to pay extra attention.

1. Set the Models

Set the required models as seen in the image.
2. Load the Input Image

Upload your input images. You can use from 1 up to 3 images; the workflow works with a single input image as well, just bypass the unused image nodes.
3. Write the Prompt

Write a detailed prompt. Designate each input as "image <number>" in the prompt, then describe whatever scene and actions you want.
4. Check the Sampling Settings

Check the sampling settings. If you want higher-quality output you can use the full model, but it needs a larger machine than Ultra; otherwise, use the fp8 model.
5. Check the Output Image


Insights

💡
I can't generate new images from scratch; the model only lets me edit or modify images that I provide as input. If I try complex compositional changes like switching backgrounds, inserting several new objects, or making extensive additions, I often find that the model struggles with these tasks.
💡
If my edits involve faces, personal data, explicit imagery, or copyrighted material, my requests might get denied or the renders might be incomplete. Sometimes, I run into limitations with supported file formats or maximum image dimensions.
💡
I'm aware that it can sometimes generate images with a "plasticky" or artificial AI look, especially in areas like skin, faces, or complex edits. This characteristic is a known limitation and has been discussed in the user community as an effect where the result lacks realism, often showing overly smooth textures, unnatural glossiness, or uniform surfaces that do not appear lifelike.

Examples

Prompt: A woman in image 1 is holding the white in image 2. She is sitting in the living room in image 3.
Prompt: A toy robot in image 1, a toy crane in image 2 and a shoe in image 3 are visible in the kitchen.
Prompt: A man facing front in image 1 wears the dress in image 2. He is holding a basket of colorful eggs in image 3. The background is a city street.

Troubleshooting

Red Nodes: Install missing custom nodes through ComfyUI Manager
Out of Memory: Use smaller expansion factors or switch to Ultra machine
Poor Quality: Check input image resolution and adjust kontext strength
Visible Seams: Lower strength and ensure good prompt description

If you’re having issues with installation or slow hardware, you can try any of these workflows on a more powerful GPU in your browser with ThinkDiffusion.

Join the ThinkDiffusion Discord Server!
ThinkDiffusion is your Stable Diffusion workspace in the cloud with unrestricted, bleeding edge opensource AI art tools.
Flux Krea Dev: Photorealistic Portraits Without the AI Look - Workflow + Guide

Prompt: A realistic person wearing a bright yellow raincoat stands smiling near the right edge of a wide, landscape-oriented frame. The background is a vibrant urban skatepark, with colorful ramps and graffiti, set after a rain—puddles reflecting bold street art and cloudy sky. The person has authentic skin texture and natural facial features, fully clothed (no topless or naked appearance), and holds a skateboard with one hand. The words "Flux Krea" are clearly visible as part on the clothing. Lighting is soft and natural, with realistic shadows and reflections. No plasticky skin, no overly smooth surfaces, no AI artifacts, no uncanny valley. The composition is dynamic and wide, blending urban energy with lifelike detail. Highly detailed, organic, and human appearance

Flux Krea Dev is built to create photorealistic images without the usual AI giveaways: no plastic skin, no overly smooth textures, no uncanny-valley weirdness.

This is a collaboration between Black Forest Labs and Krea AI. It's a 12-billion parameter model that handles realistic skin tones, natural lighting, and follows prompts accurately. Works especially well for portraits and character shots.

It's faster than standard Flux Dev and produces more varied, lifelike results.

What is Flux Krea Dev?

Source: Flux Krea Dev

Flux Krea Dev is designed to generate photorealistic images while avoiding common AI artifacts like plastic-looking skin or overprocessed textures. The 12-billion parameter model maintains realistic detail, handles nuanced lighting well, and produces natural skin tones.

It supports fine-tuning, works with existing Flux workflows, and generates images quickly. Good for portraits, character work, and any project where you need genuinely realistic human features.

Comparison of Flux Krea Dev with the Standard Model

Prompt: A female assassin dances fluidly atop a city rooftop at night, her sleek, dark attire blending modern tactical gear with elegant, flowing elements. Neon lights from the city skyline reflect off her outfit as she moves with precision and grace, her silhouette striking against the urban backdrop. Seed - 521404981828559, Euler, Simple, Steps 25, CFG 1

Flux Krea Dev is a major upgrade from the original Flux Dev, designed to create genuinely lifelike images. Where Flux Dev often produces flat, repetitive, or generic results, Flux Krea Dev excels in capturing photorealistic detail, prompt accuracy, and expressive variety—making it the best open-source choice for anyone seeking high-quality, realistic AI-generated art and portraits.

Get ready for a hands-on experience designed to make your art leap off the screen and make people ask, “Is that really AI?”

Download Workflow

Installation guide

  1. Download the workflow file
  2. Open ComfyUI (local or ThinkDiffusion)
  3. Drag the workflow file into the ComfyUI window
  4. If you see red nodes, install missing components:
  • ComfyUI Manager > Install Missing Custom Nodes

Verified to work on ThinkDiffusion Build: June 27, 2025

ComfyUI v0.3.47 with flux1-krea-dev.safetensors model support

Note: We specify the build date because ComfyUI and custom node versions updated after this date may change the behavior or outputs of the workflow.

Minimum Machine Size: Ultra

Use the specified machine size or higher to ensure it meets the VRAM and performance requirements of the workflow

💡
Download the workflow and drag & drop it into your ComfyUI window, whether locally or on ThinkDiffusion. If you're using ThinkDiffusion, the minimum requirement is the Turbo 24GB machine, but we recommend the Ultra 48GB machine.

Custom Nodes

If there are red nodes in the workflow, it means the workflow is missing required nodes. Install those custom nodes for the workflow to work.

  1. Go to the ComfyUI Manager > Click Install Missing Custom Nodes
  2. Check the list below for any custom nodes that need to be installed, and click Install.

Required Models

For this guide you'll need to download these 4 recommended models.

1. flux1-krea-dev.safetensors
2. t5xxl_fp32.safetensors
3. clip_l.safetensors
4. ae.safetensors
  1. Go to ComfyUI Manager > Click Model Manager
  2. Search for the models above; when you find the exact model you're looking for, click Install, and make sure to press Refresh when you are finished.

If Model Manager doesn't have them: Use the direct download links (included with the workflow) and upload through ThinkDiffusion MyFiles > Upload URL. Refer to our docs for more guidance on this.

You can also use the model path source instead by pasting the model's link address into ThinkDiffusion MyFiles using Upload URL.

Model Name → ThinkDiffusion Upload Directory

flux1-krea-dev.safetensors → .../comfyui/models/diffusion_models/
t5xxl_fp32.safetensors → .../comfyui/models/clip/
clip_l.safetensors → .../comfyui/models/clip/
ae.safetensors → .../comfyui/models/vae/
💡
If flux1-krea-dev.safetensors is already a pre-loaded model, you don't need to upload it.

Step-by-step Workflow Guide

This workflow was pretty easy to set up and runs well with the default settings. Here are a few steps where you might want to pay extra attention.

1. Set the Models

Set the models as shown in the image.
2. Write a Prompt

Write a prompt with a realistic description. It is best to depict a character or portrait in the prompt.
3. Check the Sampling

Set the sampling settings as shown in the image. Don't change the scheduler or sampler, as doing so may result in poor quality.
4. Check the Generated Image


Examples

Prompt: A vivid portrait of a modern street artist, spray paint stains on their hands and a colorful bandana covering part of their face. The character stands against a bright graffiti wall, with energetic splashes of neon paint in the background. Their eyes are lively and creative, capturing urban spirit and bold individuality. Studio-quality lighting, crisp detail, portrait format
Prompt: A close-up portrait of a mysterious elf scholar, pointed ears visible beneath tousled silver hair, intricate crystal earrings, and a deep blue cloak embroidered with ancient runes. The character is posed against a backdrop of weathered library shelves filled with glowing magical tomes. Soft, moody lighting highlights delicate facial features and thoughtful eyes. Highly detailed, fantasy avatar style, portrait orientation
Prompt: A close-up shot of a young woman with curly light brown hair. The woman's hair is blowing in the wind and cascades over her shoulders. She is wearing a white tank top with a white strap on the back. Her eyes are open and her lips are pursed. The background is blurred out, but it appears to be a sunny day. The sky is a light blue, and the sun is shining down on the left side of the image.
Prompt: A realistic person shown from the waist up in a landscape-oriented image, positioned randomly within the frame (off-center, left, right, or anywhere unexpected). The person is fully clothed in casual, modern attire—no topless or naked appearance. The background is visually interesting and randomly chosen, such as a lively city street, tranquil beach, cozy café, lush garden, misty park, or an abstract blur. The person has authentic skin texture, natural facial features, and expressive emotion, with realistic lighting and true color tones. No plasticky skin, no overly smooth surfaces, no AI artifacts, no uncanny valley. Composition is wide and balanced, with distinct background elements contributing to the atmosphere. Highly detailed, organic, human appearance
Prompt: An indoor shot of a man smoking a cigarette. The smoke is coming from behind his head, obscuring his face in the upper right corner of the frame. His left hand is resting on his chin, while his right hand is holding the cigarette in his left hand. His mouth is slightly open, as if he is about to smoke the cigarette. His right ear is visible, and his upper left ear is showing. His eyes are dark, and he has a small amount of light shining on his face. The lighting is subdued, as evidenced by the shadow of the man's face on the right side of the image.
Prompt: A medium-sized woman stands in front of a colorful graffiti wall. The woman's hair is long and cascades down to her shoulders. She is wearing a black short-sleeved t-shirt and black pants. The word "FLOYO" is written in large, bold letters in a vibrant shade of pink, orange, and black. The letters are outlined in a darker shade of black. The wall behind the woman is a vibrant combination of blue, green, yellow, and orange. The ground beneath her is a dark gray asphalt.
Prompt: A highly realistic portrait of a person with authentic skin texture and natural facial features, expressive eyes, and genuine emotion. The background is random and visually interesting—such as an urban street, cozy café, lush garden, or abstract blurred scene—adding character to the composition. Lighting is soft and flattering, with true-to-life colors and gentle shadows. No plasticky skin, no AI artifacts, no overly smooth surfaces, no uncanny valley. Rich detail, organic appearance

Troubleshooting

Red Nodes: Install missing custom nodes through ComfyUI Manager
Out of Memory: Use smaller expansion factors or switch to Ultra machine
Poor Quality: Check input image resolution and adjust kontext strength
Visible Seams: Lower strength and ensure good prompt description

If you’re having issues with installation or slow hardware, you can try any of these workflows on a more powerful GPU in your browser with ThinkDiffusion.

If you're having issues with the workflow, visit us on Discord at #Help Desk, or email us at [email protected].

Join the ThinkDiffusion Discord Server!
ThinkDiffusion is your Stable Diffusion workspace in the cloud with unrestricted, bleeding edge opensource AI art tools.
Top 5 ComfyUI Flux Workflows

Flux is one of the most popular AI image models in ComfyUI right now. It handles text-to-image generation well, follows prompts accurately, and works across different art styles.

Here are the 5 Flux workflows people are using the most. Each one does something different, from filling in parts of images to training custom models to keeping characters looking consistent across scenes.

Pick whichever matches what you're trying to do.

  • Inpainting with Reference using Flux Fill and Flux Redux workflow: lets you naturally fill or edit image areas to match a reference image's style.
  • Train Flux Models using Flux workflow: lets you quickly fine-tune AI models with your own images in ComfyUI.
  • Image2Image using Flux ControlNet workflow: lets you transform or enhance images with precise structure and style control by combining your input image, prompts, and advanced ControlNet guidance.
  • Consistent Character Creating using Flux workflow: lets you generate multiple images of the same character with matching features, style, and identity across different poses and scenes.
  • Intro to Flux workflow: Flux is a state-of-the-art AI model for generating high-quality, detailed images from text prompts, known for its versatility, prompt accuracy, and support for diverse artistic styles.

Inpainting with Reference using Flux Fill and Flux Redux


What it's great for:

  • Seamlessly inpaints, outpaints, and fills missing or extended areas with natural blending and high visual consistency.
  • Accurately follows user text prompts for content replacement, restyling, and creative modifications
  • Combines styles or images to generate unique variations while preserving important details.
  •  Supports adjustments in aspect ratio, guidance scale, resolution, and blending intensity for tailored results.

Inpainting using reference images is ideal for digital artists, photographers, designers, and content creators who want to restore, enhance, or modify images seamlessly. It benefits anyone needing to remove objects, repair photos, or add new elements that match the original style, making it useful for both professionals and hobbyists.

The Inpainting Revolution: How Reference Images with Flux Fill and Flux Redux Are Changing the Game
Ever needed to add something to an image that wasn’t there before? That’s where Flux Fill and Flux Redux come in – they’re changing the game for image editing by making inpainting (filling in parts of images) look natural and professional.

Train Flux Models using Flux


What it's great for:

  • Train Flux models efficiently on machines with low VRAM.
  • Node-based interface to manage the entire training process within the same environment as your image generation workflows.
  • Easily train on small datasets (10–30 images), mix multiple datasets, and apply data augmentation or custom captions for diverse results.
  • Fine-tune training with adjustable parameters like batch size, learning rate, optimizer type.
  • Produce detailed and consistent models suitable for portraits, objects, styles, and more.

Training Flux models with ComfyUI is great for digital artists, designers, developers, and content creators who want to create custom AI image styles or assets. It’s accessible even for beginners and hobbyists, thanks to its user-friendly interface and low hardware requirements, making advanced AI image generation possible for anyone interested in creative or professional projects.

Building Better Models: Flux LoRAs in ComfyUI
What if you could make every image you generate conform to a certain style or person? This is exactly what using a LoRA model with Flux AI in ComfyUI does. In this guide, we’ll explore how Flux can help you build stronger, more efficient models with ease.

Image2Image using Flux Controlnet


What it's great for:

  • Integrates edge detection (Canny, HED) and depth maps as information to guide image transformation.
  • All models are trained and optimized for 1024x1024 resolution, enabling the generation of detailed, high-quality images suitable for professional and creative use.

Image2Image Flux with ControlNet is ideal for digital artists, designers, animators, and content creators who want precise control over image transformations. It benefits anyone looking to guide AI-generated images with edge, pose, or depth maps, making it useful for creative projects, marketing visuals, game assets, or personal artwork.

Precision in Flux Art: Harnessing the Power of ControlNet
Flux lets you create impressive images from text prompts. ControlNet is a significant tool for controlling the precise composition of AI images.

Consistent Character Creating using Flux


What it's great for:

  • Generates multiple character poses and expressions from a single reference image.
  • Maintains strong identity consistency across all outputs (face, clothing, colors).
  • Supports multi-angle and multi-scene character generation for comics, games, and animation.
  • Allows easy style, background, and attribute customization with prompts.

Artists, animators, game developers, marketers, and hobbyists can all benefit from Flux’s consistent character workflow. It’s perfect for anyone who needs uniform character designs across multiple scenes, poses, or styles, making it useful for comics, animation, games, branding, and personal creative projects.

Utilizing Flux in ComfyUI for Consistent Character Creation
Creating characters that look the same every time is crucial. This guide will show you how to maintain consistency using the workflow in ComfyUI.

Intro to Flux


What it's great for:

  • Ensures consistent character appearance across multiple images and poses.
  • Works with both reference images and text prompts for flexible input.
  • Offers modular, user-friendly workflows for generation, upscaling, and detailing.
  • Supports various art styles and creative applications like comics, games, and animation.
  • Provides advanced controls for fine-tuning and precise customization.

Flux in ComfyUI is great for digital artists, designers, marketers, content creators, and hobbyists. Anyone who needs quick, high-quality, and consistent AI-generated images for creative projects, branding, or personal use can benefit from using it.

Introduction to Flux - Quick Guide
Flux has burst onto the scene as the de facto AI art model. It’s here and easy to use on ThinkDiffusion, so let’s dive in and show you how it works!

If your computer's struggling or installation is giving you headaches, try these workflows in your browser with ThinkDiffusion. We provide the GPU power so you can focus on creating.

Enjoy experimenting with these workflows! And remember - every pro started as a beginner once.

If you enjoy ComfyUI and want to test out HyperSD in ComfyUI and Blender in real time, feel free to check out Real-Time Creativity: Leveraging Hyper SD and Blender with ComfyUI. And have fun out there!

Wan2.2 Workflow + Guide: Turn Text Into Cinematic Video

Prompt: A shy apprentice mage, cloaked in tattered robes, stands beside a glowing portal deep within a misty, enchanted forest at twilight. Strange fireflies flicker around ancient twisted trees, and distant magical runes pulse gently on mossy stones. The camera spirals in from above, capturing the mage’s hesitant gestures as arcane sparks dance between their fingers. The sound of whispering leaves and a faint, mystical melody fills the air—immersing viewers in a fantastical, atmospheric scene with lifelike lighting, rich magical effects, and cinematic visual storytelling.

Type a scene description, get a video. That's Wan2.2.

This is an open-source text-to-video model that uses a Mixture-of-Experts system to create realistic motion and accurate visuals. It handles 720p videos with smoother animation and fewer artifacts than version 2.1, and it runs on standard GPUs without needing a server farm.

Describe a fantasy world, a dramatic scene, or something everyday—Wan2.2 turns it into video. Works well for storyboarding, concept testing, or just seeing your ideas move.

What we'll cover

  1. What Wan2.2 is and how it's better than 2.1
  2. Getting the workflow running on ThinkDiffusion
  3. Installing the 6 models you need
  4. Walking through the workflow settings
  5. Real video examples across different styles
  6. Common issues and fixes

The Wan2.2 Release


Wan2.2 is a next-generation, open-source text-to-video model featuring a Mixture-of-Experts (MoE) system for dramatically more realistic motion, prompt accuracy, and cinema-quality visuals compared to Wan2.1. With much larger training data and smart MoE design, it delivers fluid, artifact-free 720p videos quickly and efficiently, even on standard GPUs. Artists, animators, filmmakers, and creators at any level will find Wan2.2 superior for its greater detail, smoother animation, and enhanced creative control—making it the go-to tool for high-quality, prompt-driven video generation.
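That MoE design is also why this workflow loads two diffusion models: a high-noise expert that shapes the early, noisy denoising steps and a low-noise expert that refines the later ones. Here is a conceptual sketch of the idea, not the real implementation (the switch point below is illustrative only):

# Two-expert denoising, per Wan2.2's published MoE description: the
# high-noise expert runs first, then hands off to the low-noise expert.
def moe_denoise(latent, timesteps, high_noise_expert, low_noise_expert,
                switch_fraction=0.5):  # illustrative, not Wan2.2's threshold
    n = len(timesteps)
    for i, t in enumerate(timesteps):
        expert = high_noise_expert if i < n * switch_fraction else low_noise_expert
        latent = expert(latent, t)  # one denoising step with the active expert
    return latent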


Prompt: A teenage boy in a faded hoodie bicycles down a rain-slicked suburban street under a brooding twilight sky. The houses’ windows glow warmly as he pedals past, his breath visible in the cold air. The camera tracks alongside at wheel level, water spraying from the tires and reflecting streetlights. Wind ruffles fallen leaves, dogs bark in the distance, and the sound of passing cars merges with distant thunder—evoking a moody, authentic suburban scene with nuanced lighting and a strong sense of realism.

Whether you’re experimenting with fantasy worlds, dramatic scenes, or lifelike moments that pulse with real atmosphere, prepare to be amazed—because with Wan2.2, the magic of cinematic video is just a sentence away, waiting to be brought to life by your imagination.

Download Workflow

Installation guide

  1. Download the workflow file
  2. Open ComfyUI (local or ThinkDiffusion)
  3. Drag the workflow file into the ComfyUI window
  4. If you see red nodes, install missing components:
  • ComfyUI Manager > Install Missing Custom Nodes

Verified to work on ThinkDiffusion Build: July 9, 2025

ComfyUI v0.3.47 with wan2.2_t2v_high_noise_14B_fp8_scaled.safetensors and wan2.2_t2v_low_noise_14B_fp8_scaled.safetensors, plus the LoRAs high_noise_model.safetensors and low_noise_model.safetensors

Note: We specify the build date because ComfyUI and custom node versions updated after this date may change the behavior or outputs of the workflow.

Minimum Machine Size: Ultra

Use the specified machine size or higher to ensure it meets the VRAM and performance requirements of the workflow

💡
Download the workflow and drag & drop it into your ComfyUI window, whether locally or on ThinkDiffusion. If you're using ThinkDiffusion, the minimum requirement is the Turbo 24GB machine, but we recommend the Ultra 48GB machine.

Custom Nodes

If there are red nodes in the workflow, it means the workflow is missing required nodes. Install those custom nodes for the workflow to work.

  1. Go to the ComfyUI Manager > Click Install Missing Custom Nodes
  2. Check the list below for any custom nodes that need to be installed, and click Install.

Required Models

For this guide you'll need to download these 6 recommended models.

1. wan2.2_t2v_high_noise_14B_fp8_scaled.safetensors
2. wan2.2_t2v_low_noise_14B_fp8_scaled.safetensors
3. umt5-xxl-enc-bf16.safetensors
4. wan_2.1_VAE_bf16.safetensors
5. high_noise_model.safetensors
6. low_noise_model.safetensors
  1. Go to ComfyUI Manager > Click Model Manager
  2. Search for the models above; when you find the exact model you're looking for, click Install, and make sure to press Refresh when you are finished.

If Model Manager doesn't have them: Use the direct download links (included with the workflow) and upload through ThinkDiffusion MyFiles > Upload URL. Refer to our docs for more guidance on this.

You can also use the model path source instead by pasting the model's link address into ThinkDiffusion MyFiles using Upload URL.

Model Name → ThinkDiffusion Upload Directory

wan2.2_t2v_high_noise_14B_fp8_scaled.safetensors → .../comfyui/models/diffusion_models/
wan2.2_t2v_low_noise_14B_fp8_scaled.safetensors → .../comfyui/models/diffusion_models/
umt5-xxl-enc-bf16.safetensors → .../comfyui/models/text_encoders/
wan_2.1_VAE_bf16.safetensors → .../comfyui/models/vae/
high_noise_model.safetensors → .../comfyui/models/lora/
low_noise_model.safetensors → .../comfyui/models/lora/
💡
You need to rename the Lightning x2v LoRA file to your desired name; you can do this before you upload, or rename it directly in your ThinkDiffusion My Files. It needs to be renamed because the LightningX2V T2V and I2V LoRAs share the same filename.
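On a local install the rename is a one-liner. A sketch with example target names (any names that keep the T2V and I2V copies distinct will do):

from pathlib import Path

lora_dir = Path("ComfyUI/models/lora")  # "loras" on stock ComfyUI installs
# Rename the T2V LoRAs so they can't collide with the I2V LoRAs, which
# ship under the same filenames. The new names here are examples only.
for old, new in [
    ("high_noise_model.safetensors", "lightx2v_t2v_high_noise.safetensors"),
    ("low_noise_model.safetensors", "lightx2v_t2v_low_noise.safetensors"),
]:
    src, dst = lora_dir / old, lora_dir / new
    if src.exists() and not dst.exists():
        src.rename(dst)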

Step-by-step Workflow Guide

This workflow was pretty easy to set up and runs well with the default settings. Here are a few steps where you might want to pay extra attention.

1. Set the Models

Set the models as shown in the image. Enable the low-mem load option if you experience out-of-memory errors.
2. Write a Prompt

Write a detailed prompt; Wan2.2 has strong prompt adherence. Set the size to 480p only. Wan2.2 is compatible with 720p and 1080p resolutions and higher frame counts, but you need a larger machine for that.
3. Check Sampling

Set the sampling settings as shown in the image. Since the workflow uses a Lightning x2v LoRA, the inference steps should be set to 4 only; otherwise, it will result in an error.
4. Check the Video

💡
I was only able to run the workflow at 480p resolution and process up to 81 frames due to the limitations of my current hardware. However, if you have a more powerful ComfyUI setup, I recommend utilizing the higher-capacity Wan2.2 model for improved performance and output quality. Robust hardware resources will allow you to take full advantage of advanced models and process higher resolutions or longer sequences more efficiently.
💡
From my experience, using the Lightning x2v LoRA in the workflow is crucial for optimizing generation speed. Whenever I include this LoRA, I’ve noticed a significant reduction in processing time, allowing me to complete tasks far more efficiently. Without it, the generation process can be extremely sluggish—sometimes taking up to 30 minutes for a single output. Leveraging the Lightning x2v LoRA has become an essential part of my workflow to ensure fast and reliable results.

Examples


Prompt: A curious animated raccoon wearing a tiny yellow raincoat tiptoes through a moonlit alley crowded with overflowing trash bins and twinkling puddles. Neon reflections shimmer on slick cobblestones as the raccoon sniffs around, occasionally startled by animated cats darting past or cans tumbling noisily. The camera follows low to the ground, highlighting the raccoon’s expressive eyes, fluffy tail, and gentle paws. Soft jazz plays in the background, muffled by distant city hum, enriching the whimsical, atmospheric animated scene with vivid nighttime textures and lively character animation.


Prompt: A timid maintenance worker, clutching a flashlight, descends into a labyrinthine subway tunnel after midnight. Shadows creep along cracked tiles and ancient graffiti as unnatural screeches echo from deep within the darkness. The camera pans slowly through the eerie silence, glancing over the worker’s terrified face just as a monstrous, skeletal creature with glowing eyes slithers from the shadows behind rusted pipes. Flickering lights reveal jagged claws and a sinister grin, while the tunnel fills with chilling whispers, distant thunder, and heart-pounding footsteps—immersing viewers in a suspenseful, terrifying scene with detailed, cinematic horror atmosphere and truly frightening monster realism.


Prompt: A cheerful animated robot with bright, glowing eyes and spindly limbs bounces across a moonlit carnival filled with oversized balloons and swirling rides. Colorful lights flash across the robot’s reflective body as it weaves between laughing animated animals and swirling confetti. The camera smoothly follows its energetic movements, capturing sparks that fly from its hands as it dances on a spinning carousel. Background music mixes playful electronic beats with distant giggles, bringing the whimsical animated scene to life with vivid detail, dynamic lighting, and expressive character animation.


Prompt: A reserved florist in an apron quietly arranges bouquets in her shop during a gentle afternoon rain. Outside, a young musician stands beneath the awning, strumming a soft tune as passersby hurry past. The camera glides from the rain-dappled window to the florist’s thoughtful smile, then out to the musician meeting her gaze. Petals scatter on the countertop, rain streaks the glass, and the subdued city sounds blend with tender chords—creating an intimate, atmospheric romance scene, brought to life with authentic lighting, emotion, and cinematic mood.


Prompt: A retired detective in a worn trench coat quietly surveys an abandoned train station shrouded in thick morning fog. He holds an old photograph, scanning empty benches and flickering overhead lights. The camera moves slowly between cracked tiles and rusty tracks, focusing on the detective’s tense posture and sharp gaze. Distant echoes of footsteps, the hum of departing trains, and swirling mist fill the soundscape—creating a suspenseful, moody mystery scene with cinematic depth, authentic atmosphere, and lifelike environmental details.


Prompt: A weary space mechanic, dressed in a patched jumpsuit, floats outside a battered starship as a nebula glows in the distance. His helmet visor reflects flickers from distant lightning storms. The camera glides slowly along the hull, capturing him making delicate repairs against the eerie, luminous backdrop. Tools drift beside him, while transmission crackles and distant alarms layer the soundscape. Each movement creates swirling flashes of color and shadow—evoking the tension and isolation of sci-fi space travel with atmospheric, movie-quality lighting and realistic technical detail.


Prompt: An exhausted firefighter, his face streaked with soot and sweat, stands in the middle of a rain-soaked street at night. Neon signs flicker in the hazy background as emergency lights flash across glistening puddles. The camera starts at ground level, slowly dollying upward and forward to frame the firefighter’s determined expression in close-up. Street reflections shimmer, rain falls gently, and distant sirens echo—capturing an atmosphere of tension, resilience, and gritty realism with movie-grade lighting and natural color grading.


Troubleshooting

Red Nodes: Install missing custom nodes through ComfyUI Manager
Out of Memory (Allocation on Device error): Use smaller expansion factors, lower resolution, or fewer frames. Optionally, try higher-VRAM machines such as ULTRA or the newly available NITRO (beta feature). Note that even with more VRAM, you may still encounter out-of-memory issues with Wan2.2.
Poor Quality: Check input image resolution and adjust kontext strength
Visible Seams: Lower strength and ensure good prompt description

If you’re having issues with installation or slow hardware, you can try any of these workflows on a more powerful GPU in your browser with ThinkDiffusion.

If you're having issues with the workflow, visit us on Discord at #Help Desk, or email us at [email protected].

Join the ThinkDiffusion Discord Server!
ThinkDiffusion is your Stable Diffusion workspace in the cloud with unrestricted, bleeding edge opensource AI art tools.
Qwen Image2Image Edit: Run in the Browser + Guide

Prompt: Transform the image into realistic image.

Change backgrounds. Swap objects. Add stuff. Remove stuff. Adjust styles. All through simple text prompts instead of wrestling with complicated tools.

Qwen is Alibaba's image editing model, built on their 20B-parameter foundation. It handles object manipulation, style transfers, and even text editing inside images. The results are surprisingly realistic, and it keeps context better than you'd expect.

Useful if you're creating content, designing stuff, running social media, or just want to edit images without learning Photoshop.

What we'll cover

  1. What Qwen Image Edit actually is and what makes it different
  2. Getting the workflow running on ThinkDiffusion
  3. Installing the models and custom nodes you need
  4. Walking through the workflow step-by-step
  5. Real examples of what it can do
  6. Common issues and how to fix them

What is Qwen Image Edit?

Source: Qwen Image

Qwen Image Edit is a model developed by Alibaba's Qwen team, built upon their robust 20B-parameter Qwen-Image foundation. This model brings precise object manipulation, accurate style and background transfer, and dual-language text editing directly within images. It does a solid job with realism and keeping details intact, even when you're asking it to do tricky edits.

Ideal for content creators, designers, marketers, social media teams, localization experts, e-commerce businesses, and anyone seeking intuitive, professional-grade image editing through the power of natural language.

Prompt: Transform the image crochet style.

So go ahead: give your edits a voice, and see just how far your words can take your next photo adventure!

Download Workflow

Installation guide

  1. Download the workflow file
  2. Open ComfyUI (local or ThinkDiffusion)
  3. Drag the workflow file into the ComfyUI window
  4. If you see red nodes, install missing components:
  • ComfyUI Manager > Install Missing Custom Nodes

Verified to work on ThinkDiffusion Build: September 5, 2025

ComfyUI v0.3.57 with qwen_image_edit_fp8_e4m3fn.safetensors and qwen_2.5_vl_7b_fp8_scaled.safetensors

Note: We specify the build date because ComfyUI and custom node versions updated after this date may change the behavior or outputs of the workflow.

Minimum Machine Size: Ultra

Use the specified machine size or higher to ensure it meets the VRAM and performance requirements of the workflow

💡
Download the workflow and drag & drop it into your ComfyUI window, whether locally or on ThinkDiffusion. If you're using ThinkDiffusion, the minimum requirement is the Turbo 24GB machine, but we recommend the Ultra 48GB machine.

Custom Nodes

If there are red nodes in the workflow, it means the workflow is missing required nodes. Install those custom nodes for the workflow to work.

  1. Go to the ComfyUI Manager > Click Install Missing Custom Nodes
  2. Check the list below for any custom nodes that need to be installed, and click Install.

Required Models

For this guide you'll need to download these 3 recommended models.

1. qwen_image_edit_fp8_e4m3fn.safetensors
2. qwen_2.5_vl_7b_fp8_scaled.safetensors
3. qwen_image_vae.safetensors
  1. Go to ComfyUI Manager > Click Model Manager
  2. Search for the models above; when you find the exact model you're looking for, click Install, and make sure to press Refresh when you are finished.

If Model Manager doesn't have them: Use the direct download links (included with the workflow) and upload through ThinkDiffusion MyFiles > Upload URL. Refer to our docs for more guidance on this.

You can also use the model path source instead by pasting the model's link address into ThinkDiffusion MyFiles using Upload URL.

Model Name → ThinkDiffusion Upload Directory

qwen_image_edit_fp8_e4m3fn.safetensors → .../comfyui/models/diffusion_models/
qwen_2.5_vl_7b_fp8_scaled.safetensors → .../comfyui/models/text_encoders/
qwen_image_vae.safetensors → .../comfyui/models/vae/

Step-by-step Workflow Guide

This workflow was pretty easy to set up and runs well with the default settings. Here are a few steps where you might want to pay extra attention.

1. Load an Input Image

Load an input image. Any reasonable resolution will do as long as the image is high quality.
2. Set the Models

Set the models as shown in the image. If you have good hardware you can use the full model, but it has to be downloaded manually.
3. Write the Prompt

Write a simple prompt that serves as the instruction for the kind of image edit you want to make.
4. Check the Sampling

Set the sampling settings as shown in the image.
5. Check the Output

💡
When I try complex edits or chain too many instructions, the results can include artifacts, context loss, or image offsets, especially in heavy transformation scenarios.
💡
While text rendering and semantic edits are strong, I have to watch out for occasional mismatches, artifacts, or unexpected outcomes until newer versions address these issues.
💡
Sometimes, when I edit an image, Qwen Image Edit changes the aspect ratio or introduces zoom, which means the output doesn’t perfectly match my original framing or pixel dimensions.

Examples

IP Creation

Prompt: This dog wears a doctor suit, no helmet on its head and wears a stethoscope.

Novel View Synthesis

Prompt: Obtain the back-side of toy.

Avatar Creator

Prompt: Transform the image into Ghibli style.

Object Add

Prompt: Add a realistic cat beside the dog.

Object Removal

Prompt: Remove the bird.

Object Replace

Prompt: Replace the coffee with coke in can.

Background Swap

Prompt: Replace the background with beach.

Virtual Try-On

Prompt: Replace the woman dress into a futuristic cyberpunk dress.

Text Editing

Prompt: Replace the 'Hard Rock' to 'ThinkDiffusion'

Troubleshooting

Red Nodes: Install missing custom nodes through ComfyUI Manager
Out of Memory: Use smaller expansion factors or switch to Ultra machine
Poor Quality: Check input image resolution and adjust kontext strength
Visible Seams: Lower strength and ensure good prompt description

If you’re having issues with installation or slow hardware, you can try any of these workflows on a more powerful GPU in your browser with ThinkDiffusion.

Join the ThinkDiffusion Discord Server!
ThinkDiffusion is your Stable Diffusion workspace in the cloud with unrestricted, bleeding edge opensource AI art tools.
No More Heavy Models! WAN2.2 Rapid-AllInOne Makes Video Generation Easy

Prompt: A male assassin dances fluidly atop a city rooftop at night, full dark robe attire blending modern tactical gear with elegant, flowing elements. Neon lights from the city skyline reflect off his outfit as he moves with precision and grace, his silhouette striking against the urban backdrop.

Tired of dealing with complex setups and high memory demands just to make a single video? WAN2.2 Rapid-AllInOne changes the game by combining the best of WAN 2.2 and its accelerators into one lightweight, user-friendly model that works fast—even on lower VRAM systems.

Making AI videos usually means downloading multiple huge model files, dealing with memory errors, and babysitting complicated setups. WAN 2.2 Rapid-AllInOne rolls everything into a single file that runs faster and needs less VRAM.

It combines WAN 2.2, its accelerators, CLIP, and VAE into one model created by Phr00t. You load one file instead of several, it runs in 4 sampling steps, and works on lower-end hardware. Whether you're starting from an image or a text prompt, you get smooth motion without the usual technical headaches.

What we'll cover:

  1. Why this model is easier than regular WAN 2.2
  2. How to install and set up the workflow
  3. Running different generation modes (image-to-video, text-to-video, etc.)
  4. Examples and troubleshooting

Why Use Wan 2.2 Rapid-AllInOne?


Prompt: A toy vehicle sits on a polished wooden tabletop, captured from a dramatic front-facing perspective. Suddenly, a pair of playful child’s fingers gently grasp the miniature vehicle and rotate it, revealing its detailed rear view. 

Rapid-AllInOne is a fast, all-in-one AI video generation model developed by the creator “Phr00t,” who merged WAN 2.2 and various accelerators, along with CLIP and VAE, to deliver rapid, simplified video creation for image-to-video and text-to-video workflows. Its standout features are single-file convenience, suitable for low VRAM usage, and native integration of CLIP, VAE, and WAN accelerators for high performance and flexible output. The model is designed for speed, requiring only 4 sampling steps and 1 CFG, supporting dynamic tasks like last-frame and first-to-last-frame generation while ensuring compatibility with both WAN 2.1 and 2.2 LORAs.
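For reference, those speed-critical settings land in the sampler node. Below is a sketch of the relevant inputs as they might appear in an API-format ComfyUI workflow export; only steps=4 and cfg=1 come from the model's documentation, and the remaining values are placeholders:

# KSampler input fragment for the Rapid-AllInOne checkpoint. Only "steps": 4
# and "cfg": 1.0 are from the model's stated requirements; the sampler and
# scheduler below are placeholders - keep whatever the downloaded workflow sets.
ksampler_inputs = {
    "seed": 0,               # randomize per run for varied outputs
    "steps": 4,              # Rapid-AllInOne is tuned for 4 sampling steps
    "cfg": 1.0,              # and CFG 1; raising either mostly wastes time
    "sampler_name": "euler",
    "scheduler": "simple",
    "denoise": 1.0,
    # "model", "positive", "negative", "latent_image" connect to other nodes
}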

Download Workflow

Installation guide

  1. Download the workflow file
  2. Open ComfyUI (local or ThinkDiffusion)
  3. Drag the workflow file into the ComfyUI window
  4. If you see red nodes, install missing components:
  • ComfyUI Manager > Install Missing Custom Nodes

Verified to work on ThinkDiffusion Build: September 5, 2025

ComfyUI v0.3.57 with wan2.2-rapid-mega-aio-v1.safetensors model support

Minimum Machine Size: Turbo

Use the specified machine size or higher to ensure it meets the VRAM and performance requirements of the workflow

💡
Download the workflow and drag & drop it into your ComfyUI window, whether locally or on ThinkDiffusion. If you're using ThinkDiffusion, the minimum requirement is the Turbo 24GB machine, but we recommend the Ultra 48GB machine.

Custom Nodes

If there are red nodes in the workflow, it means the workflow is missing required nodes. Install those custom nodes for the workflow to work.

  1. Go to the ComfyUI Manager > Click Install Missing Custom Nodes
  2. Check the list below for any custom nodes that need to be installed, and click Install.

Required Models

For this guide you'll need to download this 1 recommended model.

1. wan2.2-rapid-mega-aio-v1.safetensors
  1. Go to ComfyUI Manager > Click Model Manager
  2. Search for the model above; when you find the exact model you're looking for, click Install, and make sure to press Refresh when you are finished.

If Model Manager doesn't have it: Use the direct download link (included with the workflow) and upload through ThinkDiffusion MyFiles > Upload URL. Refer to our docs for more guidance on this.

You can also use the model path source instead by pasting the model's link address into ThinkDiffusion MyFiles using Upload URL.

Model Name → ThinkDiffusion Upload Directory

wan2.2-rapid-mega-aio-v1.safetensors → .../comfyui/models/checkpoint/

Step-by-step Workflow Guide

This workflow was pretty easy to set up and runs well with the default settings. Here are a few steps where you might want to pay extra attention.

1. Set Model

Set the exact model as shown in the image. Use only the rapid mega model, which can handle the different generation modes.
2. Load Input

Load an input image. You can bypass any of these nodes depending on which generation mode you are using.
3. Write Prompt

Write the required prompt. It is recommended to write a detailed prompt.
No More Heavy Models! WAN2.2 Rapid-AllInOne Makes Video Generation Easy
4. Check Sampling

Check the sampling settings and set them based on the recommended settings shown in the image.
No More Heavy Models! WAN2.2 Rapid-AllInOne Makes Video Generation Easy
5. Check Output

No More Heavy Models! WAN2.2 Rapid-AllInOne Makes Video Generation Easy
💡
I2V mode: Just bypass the "end frame" so the "start frame" will be your I2V starting frame. Keep everything else the same.

T2V mode: Bypass "end frame", "start frame" and the "VACEFirstToLastFrame" node. Set strength to 0 for WanVaceToVideo.

Last Frame mode: Just bypass the "start frame" and keep "end frame". Keep everything else the same as in the picture.

First->Last Frame mode: Use the default workflow of the page.
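As a quick reference, the bypass rules above can be collapsed into one place (purely illustrative; the names match the labels used in this workflow):

```python
# Which nodes to bypass for each generation mode, per the tips above.
MODE_BYPASS = {
    "I2V": ["end frame"],
    "T2V": ["end frame", "start frame", "VACEFirstToLastFrame"],
    "Last Frame": ["start frame"],
    "First->Last Frame": [],        # default workflow, nothing bypassed
}

# T2V additionally needs strength = 0 on the WanVaceToVideo node.
WAN_VACE_TO_VIDEO_STRENGTH = {"T2V": 0.0}
```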

EXAMPLES

Image to Video

0:00
/0:03

Prompt: An enchanting, lifelike painting comes to life, portraying a cheerful young boy with tousled hair who eagerly greets a smiling girl amid the vibrant greenery of a sunlit park. Their joyful expressions illuminate the scene as sunlight dances across their faces, casting soft shadows beneath swaying trees. Surrounding them, blooming flowers and fluttering butterflies add colorful accents, while distant laughter and a gentle breeze infuse the moment with a sense of carefree happiness.

Text to Video

0:00
/0:03

Prompt: A woman in casual attire strolls across a vast, sunlit expanse of dry land under a brilliant blue sky. Her relaxed outfit flutters gently in the warm breeze as she walks, surrounded by greeny grasses and distant hills. The scene is rendered in a whimsical, soft Ghibli art style, with vibrant colors and a peaceful, dreamy atmosphere.

Last Frame Video

0:00
/0:03

Prompt: Vivid blue butterflies flutter gracefully down onto the soft, emerald-green forest floor, their delicate wings shimmering in the dappled sunlight that filters through the towering canopy above. Around them, a sparse cluster of wild mushrooms rises from the moss, their ivory caps speckled with dew, adding texture and subtle color to the woodland scenery. 

First - Last Frame Video

0:00
/0:03

Prompt: A graceful tabby cat with soft golden fur, alert green eyes, and delicate whiskers strolls quietly through the lush, dew-covered grass on a misty morning. With each step, the gentle sunlight illuminates tiny droplets clinging to her paws. Suddenly, she pauses, her gaze fixed on a woven basket nestled among wildflowers—a basket brimming with freshly laid eggs. The scene glows with warm, natural light, highlighting the intricate textures of the cat’s fur, the glistening grass, and the smooth eggshells.


Troubleshooting

Red Nodes: Install missing custom nodes through ComfyUI Manager
Out of Memory: Use smaller expansion factors or switch to Ultra machine
Poor Quality: Check input image resolution and adjust kontext strength
Visible Seams: Lower strength and ensure good prompt description

If you’re having issues with installation or slow hardware, you can try any of these workflows on a more powerful GPU in your browser with ThinkDiffusion.

Join the ThinkDiffusion Discord Server!
ThinkDiffusion is your Stable Diffusion workspace in the cloud with unrestricted, bleeding edge opensource AI art tools. | 5573 members
No More Heavy Models! WAN2.2 Rapid-AllInOne Makes Video Generation Easy
]]>
<![CDATA[How to Use Qwen Image with InstantX Union ControlNet in ComfyUI - Guide + Workflow]]>
Prompt: A hyper-realistic image of a man relaxing on a sunny Mediterranean terrace, dressed in breezy coastal resort fashion: lightweight linen button-down shirt, white cuffed chinos, woven leather sandals, vintage sunglasses, and a woven straw fedora. He sits comfortably in a rattan lounge chair surrounded by vibrant ceramic planters, sun-bleached
]]>
https://learn.thinkdiffusion.com/how-to-use-qwen-image-with-instantx-union-controlnet-in-comfyui-guide-workflow/68c2becf2dfdfd00012b03bfWed, 24 Sep 2025 08:58:16 GMTHow to Use Qwen Image with InstantX Union ControlNet in ComfyUI - Guide + Workflow
Prompt: A hyper-realistic image of a man relaxing on a sunny Mediterranean terrace, dressed in breezy coastal resort fashion: lightweight linen button-down shirt, white cuffed chinos, woven leather sandals, vintage sunglasses, and a woven straw fedora. He sits comfortably in a rattan lounge chair surrounded by vibrant ceramic planters, sun-bleached stone flooring, trailing bougainvillea vines, sparkling blue sea in the distant background, sharp daylight and natural shadows, tranquil afternoon ambiance, detailed skin and fabric textures
How to Use Qwen Image with InstantX Union ControlNet in ComfyUI - Guide + Workflow

Generating Qwen images with Controlnet unlocks a powerful way to guide your AI creations using visual structure, lines, and forms drawn or extracted from reference images. Want better control over your AI image generation? Here's how to use Qwen Image with InstantX Union ControlNet to guide your creations with poses, edges, and depth maps.

With just a simple pose, edge, depth map, or quick sketch, you can shape exactly how your output looks. Whether you're working on precise designs or expressive portraits, this workflow gives you the control you need without the complexity.

Here's what we'll cover:
1. Why InstantX Union beats DiffSynth
2. Getting the workflow set up
3. Required models and custom nodes
4. Step-by-step walkthrough
5. Real examples and troubleshooting

Why is InstantX Union Better than DiffSynth?

How to Use Qwen Image with InstantX Union ControlNet in ComfyUI - Guide + Workflow
Source: Qwen Image InstantX Union

InstantX Union ControlNet combines four control types (canny, soft edge, depth, and pose) into one model file. Instead of downloading separate models for each control type, you get everything in one package.

Unlike DiffSynth, which makes you load different models for different tasks, InstantX Union lets you switch between control types instantly. Less storage, less setup, same quality.
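For instance, the canny mode expects an edge map as its control image. If you ever need to pre-compute one outside the workflow, a standard OpenCV sketch looks like this (the file name and the 100/200 thresholds are common starting points, not values mandated by InstantX Union):

```python
import cv2

# Build a canny edge map to use as a ControlNet control image.
img = cv2.imread("reference.png")                 # placeholder input path
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(gray, 100, 200)                 # tune thresholds per image
cv2.imwrite("canny_control.png", edges)
```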

How to Use Qwen Image with InstantX Union ControlNet in ComfyUI - Guide + Workflow
Prompt: A cozy Bohemian living room, layered patterned rugs over rustic hardwood floors, low-slung vintage couches with colorful embroidered pillows, eclectic gallery wall art, clusters of hanging macramé planters and trailing greenery, carved wooden coffee table, rattan accent chairs, textured woven blankets, lantern-style ambient lighting, warm earthy color palette, relaxed and inviting atmosphere, artistic and collected look

For Qwen Image users, this means creating complex, high-quality images with simpler setup, better compatibility, and instant access to the most common control modes all in a unified, user-friendly package.

Download Workflow

Installation guide

  1. Download the workflow file
  2. Open ComfyUI (local or ThinkDiffusion)
  3. Drag the workflow file into the ComfyUI window
  4. If you see red nodes, install missing components:
  • ComfyUI Manager > Install Missing Custom Nodes

Verified to work on ThinkDiffusion Build: September 5, 2025

ComfyUI v0.3.57 using the qwen_image_fp8_e4m3fn.safetensors model

Note: We specify the build date because ComfyUI and custom node versions updated after this date may change the behavior or outputs of the workflow.

Minimum Machine Size: Ultra

Use the specified machine size or higher to ensure it meets the VRAM and performance requirements of the workflow

💡
Download the workflow and drag & drop it into your ComfyUI window, whether locally or on ThinkDiffusion. If you're using ThinkDiffusion, minimum requirement is the Turbo 24gb machine, but we do recommend the Ultra 48gb machine.

Custom Nodes

If there are red nodes in the workflow, it means the workflow lacks certain required nodes. Install the missing custom nodes for the workflow to work.

  1. Go to the ComfyUI Manager  > Click Install Missing Custom Nodes
How to Use Qwen Image with InstantX Union ControlNet in ComfyUI - Guide + Workflow
  2. Check the list below for any custom nodes that need to be installed, then click Install.
How to Use Qwen Image with InstantX Union ControlNet in ComfyUI - Guide + Workflow

Required Models

For this guide you'll need to download these 4 recommended models.

1. qwen_image_fp8_e4m3fn.safetensors
2. qwen_2.5_vl_7b_fp8_scaled.safetensors
3. qwen_image_vae.safetensors
4. Qwen-Image-InstantX-ControlNet-Union.safetensors
  1. Go to ComfyUI Manager  > Click Model Manager
How to Use Qwen Image with InstantX Union ControlNet in ComfyUI - Guide + Workflow
  2. Search for the models listed above; when you find the exact model you're looking for, click Install, and press Refresh when finished.
How to Use Qwen Image with InstantX Union ControlNet in ComfyUI - Guide + Workflow

If Model Manager doesn't have them: Use direct download links (included with the workflow) and upload through ThinkDiffusion MyFiles > Upload URL. Refer to our docs for more guidance on this.

You could also use the model path source instead by pasting the model's link address into ThinkDiffusion MyFiles using Upload URL.

Model Name → ThinkDiffusion Upload Directory
qwen_image_fp8_e4m3fn.safetensors → .../comfyui/models/diffusion_models/
qwen_2.5_vl_7b_fp8_scaled.safetensors → .../comfyui/models/text_encoders/
qwen_image_vae.safetensors → .../comfyui/models/vae/
Qwen-Image-InstantX-ControlNet-Union.safetensors → .../comfyui/models/controlnet/

Step-by-step Workflow Guide

This workflow was pretty easy to set up and runs well from the default settings. Here are a few steps where you might want to take extra note.

Steps Recommended Nodes
1. Load Input Image

Load an image. The image should be good quality; any resolution will do.
How to Use Qwen Image with InstantX Union ControlNet in ComfyUI - Guide + Workflow
2. Set a Controlnet

Set your desired ControlNet based on your preferences. If there is a human in the image, you can use pose.
How to Use Qwen Image with InstantX Union ControlNet in ComfyUI - Guide + Workflow
3. Set the Models

Set the exact models as seen on the image.
How to Use Qwen Image with InstantX Union ControlNet in ComfyUI - Guide + Workflow
4. Write a Prompt

Write a detailed description of the new image you want to generate from the input image.
How to Use Qwen Image with InstantX Union ControlNet in ComfyUI - Guide + Workflow
5. Check Sampling

Set the sampling as seen on the image.
How to Use Qwen Image with InstantX Union ControlNet in ComfyUI - Guide + Workflow
6. Check Output

How to Use Qwen Image with InstantX Union ControlNet in ComfyUI - Guide + Workflow
💡
I use Canny for crisp, accurate line control in things like architecture or detailed designs. I switch to Soft Edge when I want smoother, more natural guidance for portraits or landscapes. I rely on Depth whenever 3D space and realistic perspective are important, for things like background consistency or lighting. For Pose, I apply it to human figures when I need precise control over body position or gestures, making sure characters look natural and expressive.
💡
I experimented with several ControlNet models beyond the standard four, exploring a range of options to expand the workflow’s capabilities. However, I observed that using these alternative models often leads to unpredictable or suboptimal results, such as unusual visual distortions or unintended effects. For this reason, I recommend exercising caution when integrating non-standard ControlNet models into your workflow and thoroughly testing each model to ensure consistent output quality.

Examples

How to Use Qwen Image with InstantX Union ControlNet in ComfyUI - Guide + Workflow
Prompt: A highly-detailed robotic rhinoceros, gleaming chrome armor and neon blue LED accents, imposing mechanical form, walking through a bustling futuristic cyberpunk city at night, surrounded by towering skyscrapers with holographic billboards, atmospheric neon lights, reflective wet streets, flying vehicles in the background, electric mist and digital rain, vibrant but moody color scheme, cinematic composition, inspired by sci-fi and cyberpunk visuals, sharp focus, dynamic lighting, techno-futuristic aesthetic
How to Use Qwen Image with InstantX Union ControlNet in ComfyUI - Guide + Workflow
Prompt: A playful dog with expressive features, set in a dreamlike landscape of floating islands with impossible geometry, lush grass, sparkling waterfalls cascading into the clouds, vibrant pastel skies, hints of rainbow light, whimsical and surreal atmosphere, tranquil and magical mood, high detail, painterly finish
How to Use Qwen Image with InstantX Union ControlNet in ComfyUI - Guide + Workflow
Prompt: A sinister-looking boy with sharp eyes and an intense expression, standing in a shadowy villain lair filled with glowing red and green control panels, massive digital screens flashing ominous warnings, metallic walls lined with exposed wires and pulsing energy conduits, dark atmospheric lighting, mysterious swirling smoke at his feet, futuristic weapons and artifacts scattered around, sleek black and dark purple clothing with high-collar jacket and metallic accents, glowing symbol on his glove, intimidating and clever appearance, cinematic mood, high detail
How to Use Qwen Image with InstantX Union ControlNet in ComfyUI - Guide + Workflow
Prompt: An ancient stone building with massive weathered walls, rough-hewn blocks and primitive mortar, narrow arched doorways, tiny slit windows for defense, heavy wooden gates reinforced with iron, moss and creeping vines covering the crumbling exterior, worn flagstones leading to the entrance, rustic torch sconces, simple geometric carvings, historic atmosphere reminiscent of early medieval or prehistoric architecture, cloudy skies and soft diffused lighting, emphasis on age and durability

Troubleshooting

Red Nodes: Install missing custom nodes through ComfyUI Manager
Out of Memory: Use smaller expansion factors or switch to Ultra machine
Poor Quality: Check input image resolution and adjust kontext strength
Visible Seams: Lower strength and ensure good prompt description

If you’re having issues with installation or slow hardware, you can try any of these workflows on a more powerful GPU in your browser with ThinkDiffusion.

Join the ThinkDiffusion Discord Server!
ThinkDiffusion is your Stable Diffusion workspace in the cloud with unrestricted, bleeding edge opensource AI art tools. | 5510 members
How to Use Qwen Image with InstantX Union ControlNet in ComfyUI - Guide + Workflow
]]>
<![CDATA[Latest in Lipsync: InfiniteTalk Video2Video ComfyUI Guide]]>
0:00
/0:20

Ever found yourself wishing a portrait could actually speak, sharing stories with real movement and emotion? Now, that spark of imagination is within reach—no complicated setups required. With just a bit of creative input, you can watch your favorite images transform into

]]>
https://learn.thinkdiffusion.com/latest-in-lipsync-infinitetalk-video2video-comfyui-guide/68b04d1abf2ce000013f3963Tue, 02 Sep 2025 16:41:16 GMT
0:00
/0:20
Latest in Lipsync: InfiniteTalk Video2Video ComfyUI Guide

Ever found yourself wishing a portrait could actually speak, sharing stories with real movement and emotion? Now, that spark of imagination is within reach—no complicated setups required. With just a bit of creative input, you can watch your favorite images transform into lifelike, expressive talking portraits that surprise, engage, and even make you do a double-take.

What is InfiniteTalk? What are the Key Features?

0:00
/0:23

InfiniteTalk is a powerful audio-driven video generation model designed to create unlimited-length talking avatar videos with exceptionally accurate lip sync, natural head and body movements, and stable facial expressions—all seamlessly aligning to input audio. What sets InfiniteTalk apart from MultiTalk is its enhanced stability, dramatically reduced distortions in hands and body, and superior lip synchronization, making each generated video look more realistic and less prone to awkward or exaggerated motion.

InfiniteTalk is an audio-driven video generation model that creates realistic talking avatar videos from static images or existing videos. It provides:

  • Unlimited video length - Generate videos of any duration
  • Accurate lip sync - Audio perfectly matches mouth movements
  • Natural motion - Realistic head and body movements
  • Multi-person support - Handle multiple speakers in one video
  • Enhanced stability - Reduced distortions compared to MultiTalk

Perfect for content creators, educators, marketers, and developers who need professional talking avatars.

0:00
/0:20

If you thought only big studios could achieve this kind of realism, prepare to be amazed—InfiniteTalk Video to Video hands you the power to let your portraits do the talking!

Download Workflow

Installation guide

  1. Download the workflow file
  2. Open ComfyUI (local or ThinkDiffusion)
  3. Drag the workflow file into the ComfyUI window
  4. If you see red nodes, install missing components:
  • ComfyUI Manager > Install Missing Custom Nodes

Verified to work on ThinkDiffusion Build: August 21, 2025

ComfyUI v0.3.50 using the Wan2_1-InfiniTetalk-Single_fp16.safetensors and wan2.1_i2V_480p_14B_fp16.safetensors models

Note: We specify the build date because ComfyUI and custom node versions updated after this date may change the behavior or outputs of the workflow.

Minimum Machine Size: Ultra

Use the specified machine size or higher to ensure it meets the VRAM and performance requirements of the workflow

💡
Download the workflow and drag & drop it into your ComfyUI window, whether locally or on ThinkDiffusion. If you're using ThinkDiffusion, minimum requirement is the Turbo 24gb machine, but we do recommend the Ultra 48gb machine.

Custom Nodes

If there are red nodes in the workflow, it means the workflow lacks certain required nodes. Install the missing custom nodes for the workflow to work.

  1. Go to the ComfyUI Manager  > Click Install Missing Custom Nodes
Latest in Lipsync: InfiniteTalk Video2Video ComfyUI Guide
  2. Check the list below for any custom nodes that need to be installed, then click Install.
Latest in Lipsync: InfiniteTalk Video2Video ComfyUI Guide

Required Models

For this guide you'll need to download these 6 recommended models.

1. lightx2v_I2V_14B_480p_cfg_step_distill_rank64_bf16.safetensors
2. Wan2_1-InfiniTetalk-Single_fp16.safetensors
3. clip_vision_h.safetensors
4. wan2.1_i2V_480p_14B_fp16.safetensors
5. Wan2_1_VAE_bf16.safetensors
6. TencentGameMate/chinese-wav2vec2-base
  1. Go to ComfyUI Manager  > Click Model Manager
Latest in Lipsync: InfiniteTalk Video2Video ComfyUI Guide
  2. Search for the models listed above; when you find the exact model you're looking for, click Install, and press Refresh when finished.
Latest in Lipsync: InfiniteTalk Video2Video ComfyUI Guide

If Model Manager doesn't have them: Use direct download links (included with the workflow) and upload through ThinkDiffusion MyFiles > Upload URL. Refer to our docs for more guidance on this.

You could also use the model path source instead by pasting the model's link address into ThinkDiffusion MyFiles using Upload URL.

Model Name → ThinkDiffusion Upload Directory
lightx2v_I2V_14B_480p_cfg_step_distill_rank64_bf16.safetensors → .../comfyui/models/loras/
Wan2_1-InfiniTetalk-Single_fp16.safetensors → .../comfyui/models/diffusion_models/
clip_vision_h.safetensors → .../comfyui/models/clip_vision/
wan2.1_i2V_480p_14B_fp16.safetensors → .../comfyui/models/diffusion_models/
Wan2_1_VAE_bf16.safetensors → .../comfyui/models/vae/
TencentGameMate/chinese-wav2vec2-base → auto-download
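The last entry, TencentGameMate/chinese-wav2vec2-base, is the audio encoder the lip sync is driven from, and the workflow fetches it automatically. If you ever want to sanity-check an audio clip outside ComfyUI, a minimal sketch with the transformers library (the file name and the default extractor settings are assumptions) looks like:

```python
import torch
import torchaudio
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Model

# Default extractor settings (16 kHz mono) match how wav2vec2-base models are normally fed.
extractor = Wav2Vec2FeatureExtractor()
model = Wav2Vec2Model.from_pretrained("TencentGameMate/chinese-wav2vec2-base")

wav, sr = torchaudio.load("speech.wav")                           # placeholder audio file
wav = torchaudio.functional.resample(wav, sr, 16000).mean(dim=0)  # resample to 16 kHz mono

inputs = extractor(wav.numpy(), sampling_rate=16000, return_tensors="pt")
with torch.no_grad():
    hidden = model(**inputs).last_hidden_state
print(hidden.shape)  # (1, time_steps, 768): the audio features that drive the lip sync
```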

Step-by-step Workflow Guide

This workflow was pretty easy to set up and runs well from the default settings. Here are a few steps where you might want to take extra note.

Steps Recommended Nodes
1. Set the Models

Set the models as seen on the image. If you get an out-of-memory error, enable the low mem load option and use the fp8 versions of the models.
Latest in Lipsync: InfiniteTalk Video2Video ComfyUI Guide
2. Set the Input

Set the input as seen on the image. If you have a machine larger than Ultra, you can set the resolution to 720p.
Latest in Lipsync: InfiniteTalk Video2Video ComfyUI Guide
3. Load the Audio

Load an audio file; it should be high quality.
Latest in Lipsync: InfiniteTalk Video2Video ComfyUI Guide
4. Load the Video

Load a video. If the video is vertical, use the vertical input dimensions; otherwise, use the horizontal settings.
Latest in Lipsync: InfiniteTalk Video2Video ComfyUI Guide
5. Write the Prompt

A simple prompt is enough here.
Latest in Lipsync: InfiniteTalk Video2Video ComfyUI Guide
6. Check the Sampling

Set the sampling as seen on the image.
Latest in Lipsync: InfiniteTalk Video2Video ComfyUI Guide
7. Check the Output

Latest in Lipsync: InfiniteTalk Video2Video ComfyUI Guide

Examples

0:00
/0:40
0:00
/0:20
0:00
/0:20
0:00
/0:40

Troubleshooting

Red Nodes: Install missing custom nodes through ComfyUI Manager
Out of Memory: Use smaller expansion factors or switch to Ultra machine
Poor Quality: Check input image resolution and adjust kontext strength
Visible Seams: Lower strength and ensure good prompt description

If you’re having issues with installation or slow hardware, you can try any of these workflows on a more powerful GPU in your browser with ThinkDiffusion.

Join the ThinkDiffusion Discord Server!
ThinkDiffusion is your Stable Diffusion workspace in the cloud with unrestricted, bleeding edge opensource AI art tools. | 5510 members
Latest in Lipsync: InfiniteTalk Video2Video ComfyUI Guide
]]>
<![CDATA[Uni3C: Copy Camera Motion from Any Video - Full Guide + Workflow]]>
0:00
/0:06

Prompt: A view of the forest in a upward camera view.

💡
Credits to the awesome Benji for this workflow.

Original Link - https://www.youtube.com/watch?v=b69Qs0wvaFE&t=311s

Uni3C is a ComfyUI model by Alibaba that converts static

]]>
https://learn.thinkdiffusion.com/uni3c-copy-camera-motion-from-any-video-full-guide-workflow/68859b53977a06000110c7f0Thu, 28 Aug 2025 16:08:55 GMT
0:00
/0:06
Uni3C: Copy Camera Motion from Any Video - Full Guide + Workflow

Prompt: A view of the forest in a upward camera view.

💡
Credits to the awesome Benji for this workflow.

Original Link - https://www.youtube.com/watch?v=b69Qs0wvaFE&t=311s

Uni3C is a ComfyUI model by Alibaba that converts static images into dynamic videos by transferring camera movements from reference videos. This tutorial covers complete setup and usage.

What is Uni3C?

0:00
/0:04

Source: The Uni3C

Uni3C is a unified 3D-enhanced framework integrated into ComfyUI that enables precise, simultaneous control over both camera motion and human animation within video generation workflows. By leveraging a lightweight plug-and-play control module, Uni3C extracts and transfers motion—such as camera movements and character actions—from reference videos directly onto new scenes or images, eliminating the need for complex manual rigging or joint annotation.

This technology is especially valuable for digital artists, animators, filmmakers, virtual avatar creators, educators, and anyone in content creation seeking to bring static visuals to life with realistic, controllable movement—all with the creative freedom and modular workflow of ComfyUI.

0:00
/0:06

Prompt: A woman in a man sitting and it follows the camera angle.

Whether you’re just starting out or looking to elevate your animation game, let’s explore together how a few simple steps can transform your scenes from static to spectacular!

Download Workflow

Installation guide

  1. Download the workflow file
  2. Open ComfyUI (local or ThinkDiffusion)
  3. Drag the workflow file into the ComfyUI window
  4. If you see red nodes, install missing components:
  • ComfyUI Manager > Install Missing Custom Nodes

Verified to work on ThinkDiffusion Build: June 27, 2025

ComfyUI v0.3.44 using the Wan2_1-I2V-14B-720P_fp8_e4m3fn.safetensors and Wan21_Uni3C_controlnet_fp16.safetensors models

Note: We specify the build date because ComfyUI and custom node versions updated after this date may change the behavior or outputs of the workflow.

Minimum Machine Size: Ultra

Use the specified machine size or higher to ensure it meets the VRAM and performance requirements of the workflow

💡
Download the workflow and drag & drop it into your ComfyUI window, whether locally or on ThinkDiffusion. If you're using ThinkDiffusion, minimum requirement is the Turbo 24gb machine, but we do recommend the Ultra 48gb machine.

Custom Nodes

If there are red nodes in the workflow, it means the workflow lacks certain required nodes. Install the missing custom nodes for the workflow to work.

  1. Go to the ComfyUI Manager  > Click Install Missing Custom Nodes
Uni3C: Copy Camera Motion from Any Video - Full Guide + Workflow
  2. Check the list below for any custom nodes that need to be installed, then click Install.
Uni3C: Copy Camera Motion from Any Video - Full Guide + Workflow

Required Models

For this guide you'll need to download these 6 recommended models.

1. Wan2_1-I2V-14B-720P_fp8_e4m3fn.safetensors
2. umt5-xxl-fp16.safetensors
3. Wan21_T2V_14B_lightx2v_cfg_step_distill_lora_rank32.safetensors
4. clip_vision_h.safetensors
5. Wan2_1_VAE_bf16.safetensors
6. Wan21_Uni3C_controlnet_fp16.safetensors
  1. Go to ComfyUI Manager  > Click Model Manager
Uni3C: Copy Camera Motion from Any Video - Full Guide + Workflow
  2. Search for the models listed above; when you find the exact model you're looking for, click Install, and press Refresh when finished.
Uni3C: Copy Camera Motion from Any Video - Full Guide + Workflow

If Model Manager doesn't have them: Use direct download links (included with the workflow) and upload through ThinkDiffusion MyFiles > Upload URL. Refer to our docs for more guidance on this.

You could also use the model path source instead by pasting the model's link address into ThinkDiffusion MyFiles using Upload URL.

Model Name → ThinkDiffusion Upload Directory
Wan2_1-I2V-14B-720P_fp8_e4m3fn.safetensors → .../comfyui/models/diffusion_models/
umt5-xxl-fp16.safetensors → .../comfyui/models/text_encoders/
Wan21_T2V_14B_lightx2v_cfg_step_distill_lora_rank32.safetensors → .../comfyui/models/loras/
clip_vision_h.safetensors → .../comfyui/models/clip/
Wan2_1_VAE_bf16.safetensors → .../comfyui/models/vae/
Wan21_Uni3C_controlnet_fp16.safetensors → .../comfyui/models/diffusion_models/

Step-by-step Workflow Guide

This workflow was pretty easy to set up and runs well from the default settings. Here are a few steps where you might want to take extra note.

Steps Recommended Nodes
1. Set the Height, Width and Frames

Set the frames up to 125. You can set the resolution up to 720 or 1080.
Uni3C: Copy Camera Motion from Any Video - Full Guide + Workflow
2. Load an Input Video for Control Reference

Load any type of video. The video must contain a scene with camera movement.
Uni3C: Copy Camera Motion from Any Video - Full Guide + Workflow
3. Load an Input Image

Upload an image which serves as the base image for the generation. It works with no subject or even with multiple subjects.
Uni3C: Copy Camera Motion from Any Video - Full Guide + Workflow
4. Set the Models

Set the models as seen on the image.
Uni3C: Copy Camera Motion from Any Video - Full Guide + Workflow
5. Write Prompt and Check Sampling

Write a simple prompt that supports the kind of camera or depth movement. Check the sampling settings as seen on the image.
Uni3C: Copy Camera Motion from Any Video - Full Guide + Workflow
6. Check the Output

Uni3C: Copy Camera Motion from Any Video - Full Guide + Workflow
💡
The initial generation isn’t always perfect, so I often need to review the outputs carefully and select the best one. Sometimes, if the results don’t meet my expectations, I find it necessary to rerun the prompt to achieve the desired outcome. This iterative approach helps ensure that I consistently produce high-quality and visually appealing results, even if it takes a few attempts to get everything just right.
💡
In my experience, this method may struggle to capture very small details in the input image. I’ve found it essential to start with a clear, high-quality image that’s taken from a close, direct viewpoint. Using such input not only helps preserve important features but also leads to more accurate and visually appealing results in the final output.

Examples

0:00
/0:06

Prompt: A cat at the grass and follows the camera view.

0:00
/0:06

Prompt: A view of the living and follows the camera view.

0:00
/0:06

Prompt: A view of a woman reading at the table.


Troubleshooting

Red Nodes: Install missing custom nodes through ComfyUI Manager
Out of Memory: Use smaller expansion factors or switch to Ultra machine
Poor Quality: Check input image resolution and adjust kontext strength
Visible Seams: Lower strength and ensure good prompt description

If you’re having issues with installation or slow hardware, you can try any of these workflows on a more powerful GPU in your browser with ThinkDiffusion.

]]>
<![CDATA[Wan Start-End Frame with VACE and Flux Kontext - Complete Guide & Workflow]]>
0:00
/0:02

Prompt: A loving mother, dressed in a simple white outfit, gently lifts her baby - also clothed in white - into the air against a soft, warm background. As the animation progresses, the scene smoothly transitions: the mother brings the baby down into a

]]>
https://learn.thinkdiffusion.com/wan-start-end-frame-with-vace-and-flux-kontext-complete-guide-workflow/687a349551ec4500015ba8ddTue, 26 Aug 2025 16:57:25 GMT
0:00
/0:02
Wan Start-End Frame with VACE and Flux Kontext - Complete Guide & Workflow

Prompt: A loving mother, dressed in a simple white outfit, gently lifts her baby - also clothed in white - into the air against a soft, warm background. As the animation progresses, the scene smoothly transitions: the mother brings the baby down into a gentle embrace, ending with both of them sharing a heartfelt hug. Throughout the sequence, their expressions remain tender and natural, with the mother and baby’s white clothing staying crisp and unchanged. The background, atmosphere, and lighting remain consistent, with a seamless, natural transformation between lifting and hugging poses. 

What This Workflow Does

This ComfyUI workflow creates smooth animations by:

  • Taking your starting image
  • Generating an end frame with AI
  • Creating seamless transitions between both frames
  • Maintaining consistent subjects and backgrounds throughout
💡
Credits to the awesome TheArtOfficial for this workflow.
Original Link: https://www.youtube.com/watch?v=hB7dSagdLS8

This ComfyUI workflow creates smooth animations by taking your starting image, generating an AI-powered end frame, and creating seamless transitions between both frames while maintaining consistent subjects and backgrounds throughout. The result is professional-quality animations with cinematic transitions, subject consistency across all frames, and context-aware scene evolution without jarring cuts or morphing artifacts.

Think of it as directing a movie scene where you define the beginning and ending poses, and the AI fills in all the natural movement between them. Whether you want a mother lifting her baby into an embrace, a flower blooming to reveal something magical, or a peaceful landscape transforming dramatically, this workflow handles the complex interpolation while keeping everything visually coherent.

Why Do VACE and Flux Kontext Have the Advantage?

0:00
/0:02

Prompt: A windswept grassy field stretches into the distance, blades of grass bending gently in the breeze beneath a partly cloudy sky. As the animation progresses, a girl stands amidst the tall grass—at first with her back to the camera or positioned away. The wind tousles her hair and clothing as she slowly turns her body to face the viewer, her expression brightening into a happy, genuine smile. The final moment captures her smiling directly at the camera, radiating joy and warmth, while the lively movement of grass and hair continues throughout. 

The workflow integrates VACE, a powerful, unified model that brings precision, subject consistency, and cinematic quality to each frame, enabling coherent background changes and nuanced transformations. Flux Kontext, a context-aware image editing model, further refines the process by allowing detailed, prompt-driven modifications while maintaining object integrity and stylistic consistency from the start frame to the end frame.

By combining VACE and Flux Kontext in the Wan Start-End Frame workflow, users gain the ability to craft sophisticated, professional-grade animations with smooth transitions, reliable subject identity, and creatively controlled scene evolution that simply isn’t possible with basic interpolation alone.
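To see why that matters, the "basic interpolation" baseline is essentially a per-pixel cross-fade between the two keyframes, which makes subjects dissolve into each other instead of moving. A toy numpy illustration of that naive baseline:

```python
import numpy as np

def naive_crossfade(start: np.ndarray, end: np.ndarray, n_frames: int):
    """Per-pixel linear blend between two frames - no real motion, just ghosting."""
    frames = []
    for i in range(n_frames):
        t = i / max(n_frames - 1, 1)
        frames.append(((1 - t) * start + t * end).astype(start.dtype))
    return frames
```

VACE instead synthesizes genuine in-between motion conditioned on both keyframes and the prompt, which is why subjects stay solid across the transition.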

Let’s begin the journey toward truly seamless storytelling.

Download Workflow

Getting started requires downloading the workflow file and setting up the necessary components. First, download the workflow file and open ComfyUI, either locally or through ThinkDiffusion. Simply drag the workflow file into the ComfyUI window to load it.

Installation guide

  1. Download the workflow file
  2. Open ComfyUI (local or ThinkDiffusion)
  3. Drag the workflow file into the ComfyUI window
  4. If you see red nodes, install missing components:
  • ComfyUI Manager > Install Missing Custom Nodes

Verified to work on ThinkDiffusion Build: June 27, 2025

ComfyUI v0.3.44 using the flux1-kontext-dev.safetensors, wan2.1_t2v_14B_fp8_e4m3fn.safetensors, and Wan2_1-VACE_module_14B_fp8_e4m3fn.safetensors models

Note: We specify the build date because ComfyUI and custom node versions updated after this date may change the behavior or outputs of the workflow.

Minimum Machine Size: Ultra

Use the specified machine size or higher to ensure it meets the VRAM and performance requirements of the workflow

💡
Download the workflow and drag & drop it into your ComfyUI window, whether locally or on ThinkDiffusion. If you're using ThinkDiffusion, minimum requirement is the Turbo 24gb machine, but we do recommend the Ultra 48gb machine.

Custom Nodes

If there are red nodes in the workflow, it means the workflow lacks certain required nodes. Install the missing custom nodes for the workflow to work.

  1. Go to the ComfyUI Manager  > Click Install Missing Custom Nodes
Wan Start-End Frame with VACE and Flux Kontext - Complete Guide & Workflow
  2. Check the list below for any custom nodes that need to be installed, then click Install.
Wan Start-End Frame with VACE and Flux Kontext - Complete Guide & Workflow

Required Models

For this guide you'll need to download these 9 recommended models.

1. flux1-kontext-dev.safetensors
2. clip_l.safetensors
3. t5xxl_fp8_e4m3fn.safetensors
4. ae.safetensors
5. Wan2_1_VAE_bf16.safetensors
6. umt5_xxl_fp8_e4m3fn_scaled.safetensors
7. wan2.1_t2v_14B_fp8_e4m3fn.safetensors
8. Wan21_T2V_14B_lightx2v_cfg_distill_lora_rank32.safetensors
9. Wan2_1-VACE_module_14B_fp8_e4m3fn.safetensors
  1. Go to ComfyUI Manager  > Click Model Manager
Wan Start-End Frame with VACE and Flux Kontext - Complete Guide & Workflow
  2. Search for the models listed above; when you find the exact model you're looking for, click Install, and press Refresh when finished.
Wan Start-End Frame with VACE and Flux Kontext - Complete Guide & Workflow

If Model Manager doesn't have them: Use direct download links (included with the workflow) and upload through ThinkDiffusion MyFiles > Upload URL. Refer to our docs for more guidance on this.

You could also use the model path source instead by pasting the model's link address into ThinkDiffusion MyFiles using Upload URL.

Model Name → ThinkDiffusion Upload Directory
flux1-kontext-dev.safetensors → .../comfyui/models/diffusion_models/
clip_l.safetensors → .../comfyui/models/clip/
t5xxl_fp8_e4m3fn.safetensors → .../comfyui/models/text_encoders/
ae.safetensors → .../comfyui/models/vae/
Wan2_1_VAE_bf16.safetensors → .../comfyui/models/vae/
umt5_xxl_fp8_e4m3fn_scaled.safetensors → .../comfyui/models/text_encoders/
wan2.1_t2v_14B_fp8_e4m3fn.safetensors → .../comfyui/models/diffusion_models/
Wan21_T2V_14B_lightx2v_cfg_distill_lora_rank32.safetensors → .../comfyui/models/loras/
Wan2_1-VACE_module_14B_fp8_e4m3fn.safetensors → .../comfyui/models/diffusion_models/

Step-by-step Workflow Guide

This workflow was pretty easy to set up and runs well from the default settings. Here are a few steps where you might want to take extra note.

Steps Recommended Nodes
1. Load First Frame Image

Input your image as the first frame. You can use either a horizontal or vertical image.
Wan Start-End Frame with VACE and Flux Kontext - Complete Guide & Workflow
2. Generate End Frame Image

Set your end frame image by generating it using the flux kontext prompt.
Wan Start-End Frame with VACE and Flux Kontext - Complete Guide & Workflow
3. Set Models

Set the models as seen on the image. If you run out of memory, reduce the weights further to fp8.
Wan Start-End Frame with VACE and Flux Kontext - Complete Guide & Workflow
4. Write Prompt for First and End Frame

Write a complete prompt here for your start and end frames. It should be detailed and describe how the animation looks from start to end.
Wan Start-End Frame with VACE and Flux Kontext - Complete Guide & Workflow
5. Check Sampling

Set the sampling settings as seen on the image.
Wan Start-End Frame with VACE and Flux Kontext - Complete Guide & Workflow
6. Check Output

Wan Start-End Frame with VACE and Flux Kontext - Complete Guide & Workflow
💡
I find the workflow quite demanding, especially since it already employs two samplers. To ensure it runs smoothly on my system, I typically work with a resolution of 480x832. Based on my experience, I strongly recommend using a machine with more than 48GB of VRAM for optimal performance when handling this workflow.
💡
The resolution I’m using may be too low for many users, but I address this by applying post-processing techniques to upscale the output in a separate workflow. This approach allows me to achieve higher quality results, even when starting with lower-resolution images.

EXAMPLES

0:00
/0:02

Prompt: An empty living room, neatly arranged and softly illuminated by natural light, sits quiet and undisturbed. As the animation progresses, a man enters the scene from off-frame, walks purposefully toward the television across the room, and reaches out to switch it on. The transition is smooth, showing each action clearly—the man's movement is gradual and natural, and the television’s screen comes to life in the final moment. Throughout the sequence, all background elements, lighting, and room arrangement stay consistent, with the only changes being the man’s presence and the activation of the television. No abrupt shifts, morphing, or object distortion—maintain environment and subject clarity at all times.

0:00
/0:02

Prompt: A delicate yellow flower bud stands upright against a soft, green background, its petals gently wrapped and untouched. As the animation progresses, the bud slowly opens and blossoms into a fully bloomed yellow flower, each petal unfurling with smooth, natural motion. At the heart of the blooming flower, a tiny, peaceful baby appears asleep—nestled comfortably at the center, with serene features and a gentle posture. The background remains undisturbed throughout the sequence, focusing all changes on the opening of the bud and the baby’s gentle emergence within the flower’s center. Maintain vibrant yellow tones, realistic textures, and sharp focus on both the flower and the sleeping baby, creating a tranquil and dreamlike atmosphere.

0:00
/0:04

Prompt: A breathtaking view of Mount Fuji rises majestically above lush forests and a crystal-clear lake, set beneath bright blue skies with gentle daylight. As the animation progresses, the serene scene transforms: Mount Fuji suddenly erupts, sending torrents of fiery magma and ash into the air. Bright lava bursts from the crater, smoke billows, and the once-clear skies grow dark and turbulent, filled with swirling ash clouds. The mountain’s distinct shape and the surrounding landscape remain consistent, with only the dramatic volcanic eruption and atmospheric changes unfolding. Ensure the transition from tranquil beauty to eruption is smooth and cinematic, with natural motion in the flow of lava, spreading smoke, and gradual darkening of the environment. 


Troubleshooting

Red Nodes: Install missing custom nodes through ComfyUI Manager
Out of Memory: Use smaller expansion factors or switch to Ultra machine
Poor Quality: Check input image resolution and adjust kontext strength
Visible Seams: Lower strength and ensure good prompt description

If you’re having issues with installation or slow hardware, you can try any of these workflows on a more powerful GPU in your browser with ThinkDiffusion.

If you're having issues with the workflow, visit us at the Discord #Help Desk, or you may opt to email us at [email protected]

Join the ThinkDiffusion Discord Server!
ThinkDiffusion is your Stable Diffusion workspace in the cloud with unrestricted, bleeding edge opensource AI art tools. | 5510 members
Wan Start-End Frame with VACE and Flux Kontext - Complete Guide & Workflow

]]>
<![CDATA[Transform Any Portrait Into a Talking Character: Wan MultiTalk Image-to-Video Guide]]>
0:00
/0:12

Transform static portraits into realistic talking videos with perfect lip-sync using MultiTalk AI. No coding required.

Difficulty: Beginner-friendly
Setup Time: 15 minutes

What You'll Create

Turn any portrait - artwork, photos, or digital characters - into speaking, expressive videos that sync perfectly

]]>
https://learn.thinkdiffusion.com/transform-any-portrait-into-a-talking-character-wan-multitalk-image-to-video-guide/686fb42af92f2900013f9b4eMon, 28 Jul 2025 09:59:21 GMT
0:00
/0:12
Transform Any Portrait Into a Talking Character: Wan MultiTalk Image-to-Video Guide

Transform static portraits into realistic talking videos with perfect lip-sync using MultiTalk AI. No coding required.

Difficulty: Beginner-friendly
Setup Time: 15 minutes

What You'll Create

Turn any portrait - artwork, photos, or digital characters - into speaking, expressive videos that sync perfectly with audio input. MultiTalk handles lip movements, facial expressions, and body motion automatically.

Example Results:

  • Portrait paintings that recite poetry
  • Character artwork that delivers dialogue
  • Profile photos that sing songs
  • Multiple characters having conversations

In this tutorial, you’ll discover a surprisingly easy way to bridge the gap between static images and expressive animation. Whether you’re looking to enhance your social media posts, create memorable content for friends, or explore new storytelling techniques, this guide will open the door to a world where your characters can truly interact and entertain.

What is MultiTalk?

0:00
/0:12

MultiTalk is an open-source AI framework that converts static images into realistic talking videos using audio input. Built by MeiGen AI, it accurately syncs lip movements and facial expressions to speech or singing, supporting both single and multi-person scenes.

With support for single or multi-person scenes, text prompts for emotion and behavior control, and compatibility with real or stylized characters, MultiTalk offers incredible creative flexibility. Integrated into ComfyUI and optimized for fast performance, it’s ideal for digital artists, content creators, educators, and developers who want to bring portraits, avatars, or original characters to life in seconds.

MultiTalk Framework

Transform Any Portrait Into a Talking Character: Wan MultiTalk Image-to-Video Guide
Source: MultiTalk
💡
In this work, I present MultiTalk, an audio-driven video generation framework capable of creating realistic talking-head animations from speech input. It enables the injection of multiple audio streams simultaneously. It also integrates improved audio cross-attention layer to better align speech features with visual motion, resulting in more natural, expressive video generation.

Let’s explore how you can turn imagination into motion, and watch as your creative visions become animated realities! The possibilities are as limitless as your imagination, so let’s get started and see where your characters can take you next!

Download Workflow

Installation guide

  1. Download the workflow file
  2. Open ComfyUI (local or ThinkDiffusion)
  3. Drag the workflow file into the ComfyUI window
  4. If you see red nodes, install missing components:
  • ComfyUI Manager > Install Missing Custom Nodes
💡
Update 08/04/2025 - Workflow updated to its compatible version

Verified to work on ThinkDiffusion Build: June 27, 2025

ComfyUI v0.3.44 using the Wan14Bi2vFusionX.safetensors and multitalk.safetensors models

Note: We specify the build date because ComfyUI and custom node versions updated after this date may change the behavior or outputs of the workflow.

Minimum Machine Size: Ultra

Use the specified machine size or higher to ensure it meets the VRAM and performance requirements of the workflow

💡
Download the workflow and drag & drop it into your ComfyUI window, whether locally or on ThinkDiffusion. If you're using ThinkDiffusion, minimum requirement is the Turbo 24gb machine, but we do recommend the Ultra 48gb machine.

Custom Nodes

If there are red nodes in the workflow, it means the workflow lacks certain required nodes. Install the missing custom nodes for the workflow to work.

  1. Go to the ComfyUI Manager  > Click Install Missing Custom Nodes
Transform Any Portrait Into a Talking Character: Wan MultiTalk Image-to-Video Guide
  2. Check the list below for any custom nodes that need to be installed, then click Install.
Transform Any Portrait Into a Talking Character: Wan MultiTalk Image-to-Video Guide

Required Models

For this guide you'll need to download these 8 recommended models.

1. detailz-wan.safetensors
2. Wan21_T2V_14B_lightx2v_cfg_step_distill_lora_rank32.safetensors
3. Wan14Bi2vFusionX.safetensors
4. clip_vision_h.safetensors
5. Wan2_1_VAE_bf16.safetensors
6. umt5-xxl-enc-bf16.safetensors
7. multitalk.safetensors
8. TencentGameMate/chinese-wav2vec2-base
  1. Go to ComfyUI Manager  > Click Model Manager
Transform Any Portrait Into a Talking Character: Wan MultiTalk Image-to-Video Guide
  2. Search for the models listed above; when you find the exact model you're looking for, click Install, and press Refresh when finished.
Transform Any Portrait Into a Talking Character: Wan MultiTalk Image-to-Video Guide

If Model Manager doesn't have them: Use direct download links (included with the workflow) and upload through ThinkDiffusion MyFiles > Upload URL. Refer to our docs for more guidance on this.

You could also use the model path source instead by pasting the model's link address into ThinkDiffusion MyFiles using Upload URL.

Model Name → ThinkDiffusion Upload Directory
detailz-wan.safetensors → .../comfyui/models/loras/
Wan21_T2V_14B_lightx2v_cfg_step_distill_lora_rank32.safetensors → .../comfyui/models/loras/
Wan14Bi2vFusionX.safetensors → .../comfyui/models/diffusion_models/
clip_vision_h.safetensors → .../comfyui/models/clip/
Wan2_1_VAE_bf16.safetensors → .../comfyui/models/vae/
umt5-xxl-enc-bf16.safetensors → .../comfyui/models/text_encoders/
WanVideo_2_1_Multitalk_14B_fp8_e4m3fn.safetensors → .../comfyui/models/diffusion_models/
TencentGameMate/chinese-wav2vec2-base → auto-download
💡
If I want to achieve higher quality in my generated videos, I use the full multitalk.safetensors model. However, based on its advanced architecture and resource demands, I make sure my machine is equipped with at least 48GB of RAM and 64GB of VRAM to ensure smooth performance and optimal results. This setup allows me to take full advantage of the model’s capabilities when producing high-resolution, detailed video content.

Step-by-step Workflow Guide

This workflow was pretty easy to set up and runs well from the default settings. Here are a few steps where you might want to take extra note.

Steps Recommended Nodes
1. Load an Image

Load an input image. It should be clear and high in quality.
Transform Any Portrait Into a Talking Character: Wan MultiTalk Image-to-Video Guide
2. Set the Models

Set the models as seen on the image. Don't change any settings, as this may lead to out-of-memory errors; this workflow's settings are already at the limit.
Transform Any Portrait Into a Talking Character: Wan MultiTalk Image-to-Video Guide
3. Write a Prompt

Write a prompt. Include the word "detailz", the trigger word for the LoRA.
Transform Any Portrait Into a Talking Character: Wan MultiTalk Image-to-Video Guide
4. Check Sampling

Check the sampling settings. Keep the step count as it is, because the workflow uses a LoRA. Set the settings as seen on the image.
Transform Any Portrait Into a Talking Character: Wan MultiTalk Image-to-Video Guide
5. Check Output

Transform Any Portrait Into a Talking Character: Wan MultiTalk Image-to-Video Guide

Examples

0:00
/0:12
0:00
/0:11
0:00
/0:12
0:00
/0:12
0:00
/0:12

Troubleshooting

Red Nodes: Install missing custom nodes through ComfyUI Manager
Out of Memory: Use smaller expansion factors or switch to Ultra machine
Poor Quality: Check input image resolution and adjust kontext strength
Visible Seams: Lower strength and ensure good prompt description

If you’re having issues with installation or slow hardware, you can try any of these workflows on a more powerful GPU in your browser with ThinkDiffusion.

]]>
<![CDATA[How to Use Flux Kontext for Image-to-Panorama 3D in ComfyUI]]>
0:00
/0:31

Flux Kontext expands your images into panoramic views directly in ComfyUI. Instead of cropping or stretching, it intelligently generates new content that extends beyond your image borders, creating seamless panoramic scenes.

What You'll Get

This workflow takes a standard image and generates

]]>
https://learn.thinkdiffusion.com/how-to-use-flux-kontext-for-image-to-panorama-3d-in-comfyui/686d15a61b332100013ff5f3Thu, 10 Jul 2025 13:44:07 GMT
0:00
/0:31
How to Use Flux Kontext for Image-to-Panorama 3D in ComfyUI

Flux Kontext expands your images into panoramic views directly in ComfyUI. Instead of cropping or stretching, it intelligently generates new content that extends beyond your image borders, creating seamless panoramic scenes.

What You'll Get

This workflow takes a standard image and generates extended panoramic versions by:

  • Analyzing your input image's context and style
  • Generating new content that naturally extends the scene
  • Creating horizontal panoramas up to 3x the original width
  • Maintaining consistent lighting, perspective, and artistic style

Best for: Landscapes, cityscapes, interior shots, and any scene where you want to reveal "what's beyond the frame."

What is Image-to-Panorama?

0:00
/0:31

A panorama is an image that captures a much wider field of view than a standard photograph, often stretching across an entire landscape or environment to reveal far more than what a single frame can show. Traditionally created by stitching together multiple overlapping photos, panoramas provide a seamless, immersive scene that can be horizontal, vertical, or even a full 360-degree view. In image generation, panoramas are especially useful because they allow creators to expand the context and storytelling potential of a single image, offering a broader perspective and more detail. This capability is valuable for photographers and artists seeking to produce striking, high-resolution visuals, as well as for architects and real estate professionals who need to showcase spaces in a comprehensive, interactive way.
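One practical detail worth knowing: full 360-degree panoramas are conventionally stored as equirectangular images with a 2:1 width-to-height ratio, which is what most 360 viewers expect. A quick Pillow sketch (file names are placeholders) to check and, if needed, conform an output:

```python
from PIL import Image

# "panorama.png" is a placeholder for the workflow's saved output.
pano = Image.open("panorama.png")
w, h = pano.size
if abs(w / h - 2.0) > 0.01:
    # Resize so a 360 viewer can wrap the image around a full sphere.
    pano = pano.resize((2 * h, h))
    pano.save("panorama_2to1.png")
print(f"{w}x{h}, aspect {w / h:.2f}")
```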

0:00
/0:31

Marketers, educators, and those in travel or tourism can also benefit, using image-to-panorama and 3D view features to create virtual tours, interactive content, and engaging educational materials. Ultimately, panoramas and 3D views transform ordinary images into immersive experiences, making them more informative, captivating, and useful for a wide range of creative and professional applications.

Get ready to see your pictures in a whole new light!

Download Workflow

Installation guide

  1. Download the workflow file
  2. Open ComfyUI (local or ThinkDiffusion)
  3. Drag the workflow file into the ComfyUI window
  4. If you see red nodes, install missing components:
  • ComfyUI Manager > Install Missing Custom Nodes
💡
Credits to the awesome Dennis Schöneberg for this workflow.

Original Civitai Link: https://civitai.com/models/682349/360-degree-flux-and-kontext

Verified to work on ThinkDiffusion Build: June 27, 2025

ComfyUI v0.3.42 using the flux1-kontext-dev.safetensors model

Note: We specify the build date because ComfyUI and custom node versions updated after this date may change the behavior or outputs of the workflow.

Minimum Machine Size: Ultra

Use the specified machine size or higher to ensure it meets the VRAM and performance requirements of the workflow

💡
Download the workflow and drag & drop it into your ComfyUI window, whether locally or on ThinkDiffusion. If you're using ThinkDiffusion, minimum requirement is the Turbo 24gb machine, but we do recommend the Ultra 48gb machine.

Custom Nodes

If there are red nodes in the workflow, it means the workflow lacks certain required nodes. Install the missing custom nodes for the workflow to work.

  1. Go to the ComfyUI Manager  > Click Install Missing Custom Nodes
How to Use Flux Kontext for Image-to-Panorama 3D in ComfyUI
  2. Check the list below for any custom nodes that need to be installed, then click Install.
How to Use Flux Kontext for Image-to-Panorama 3D in ComfyUI

Required Models

For this guide you'll need to download these 8 recommended models.

1. flux1-kontext-dev.safetensors
2. clip_l.safetensors
3. t5xxl_fp16.safetensors
4. ae.safetensors
5. alimama-creative-FLUX1-Turbo-Alpha.safetensors
6. HDR360.safetensors
7. 4x-ClearRealityV1.pth
8. Florence-2-base
  1. Go to ComfyUI Manager  > Click Model Manager
How to Use Flux Kontext for Image-to-Panorama 3D in ComfyUI
  2. Search for the models listed above; when you find the exact model you're looking for, click Install, and press Refresh when finished.
How to Use Flux Kontext for Image-to-Panorama 3D in ComfyUI

If Model Manager doesn't have them: Use direct download links (included with the workflow) and upload through ThinkDiffusion MyFiles > Upload URL. Refer to our docs for more guidance on this.

You could also use the model path source instead by pasting the model's link address into ThinkDiffusion MyFiles using Upload URL.

Model Name → ThinkDiffusion Upload Directory
flux1-kontext-dev.safetensors → .../comfyui/models/diffusion_models/
clip_l.safetensors → .../comfyui/models/clip/
t5xxl_fp16.safetensors → .../comfyui/models/text_encoders/
ae.safetensors → .../comfyui/models/vae/
alimama-creative-FLUX1-Turbo-Alpha.safetensors → .../comfyui/models/loras/
HDR360.safetensors → .../comfyui/models/loras/
4x-ClearRealityV1.pth → .../comfyui/models/upscale_models/
Florence-2-base → auto-download
💡
If the file you upload for alimama-creative-FLUX1-Turbo-Alpha.safetensors arrives named diffusion_pytorch_model.safetensors, just rename it to its exact expected name.

Step-by-step Workflow Guide

This workflow was pretty easy to set up and runs well from the default settings. Here are a few steps where you might want to take extra note.

Steps Recommended Nodes
1. Load an Image

Load an image that is suitable for a 3D panorama view, such as a surrounding scene, landscape, or interior.
How to Use Flux Kontext for Image-to-Panorama 3D in ComfyUI
2. Set the Models

Set the models as seen on the image.
How to Use Flux Kontext for Image-to-Panorama 3D in ComfyUI
3. Check Prompt

Check the prompt. You don't need to write anything more.
How to Use Flux Kontext for Image-to-Panorama 3D in ComfyUI
4. Check Sampling

Check the sampling settings it should be the same as seen on the image.
How to Use Flux Kontext for Image-to-Panorama 3D in ComfyUI
5. Check Crop and Panorama Settings

Leave the crop and panorama settings at their defaults; don't change them.
How to Use Flux Kontext for Image-to-Panorama 3D in ComfyUI
6. Check Upscale

Check the upscale settings. Keep the upscale factor at 2x only; otherwise the workflow will crash.
How to Use Flux Kontext for Image-to-Panorama 3D in ComfyUI
7. Check Output

How to Use Flux Kontext for Image-to-Panorama 3D in ComfyUI
💡
From my perspective, the seamless quality of these panoramas truly stands out. I’ve found them to be ideal for VR environments, immersive displays, and any project where a genuine 360-degree visual experience is essential. The absence of jarring transitions or visible seams significantly enhances the sense of immersion, making the visuals feel smooth and uninterrupted. This level of quality has made a noticeable difference in my work, especially when aiming to create captivating and realistic virtual experiences.
💡
When preparing to load an image as input, I always ensure that the image is not captured from a 3D perspective or already in an isometric view. Using such angles can cause the generated panorama to appear distorted or misaligned. Instead, I select images taken from a direct, head-on angle, as this provides the most accurate and seamless results. Additionally, I prioritize high-quality images to maintain clarity and detail throughout the panorama generation process. This careful selection helps me achieve visually consistent and professional panoramic outputs.
💡
When I want to achieve a noticeable boost in both image quality and fine detail, I make it a point to apply tiled diffusion upscaling. This technique allows me to enhance the resolution and sharpness of the final output, ensuring that even the smallest features are rendered with impressive clarity. 

Examples

0:00
/0:32
0:00
/0:23
0:00
/0:45

Troubleshooting

Red Nodes: Install missing custom nodes through ComfyUI Manager
Out of Memory: Use smaller expansion factors or switch to Ultra machine
Poor Quality: Check input image resolution and adjust kontext strength
Visible Seams: Lower strength and ensure good prompt description

If you’re having issues with installation or slow hardware, you can try any of these workflows on a more powerful GPU in your browser with ThinkDiffusion.

]]>
<![CDATA[Make Your Character & Style LoRA Stand Out - Easy LoRA Training with FluxGym!]]>

Want to create custom AI image models but find the process intimidating? This guide shows you how to train your own LoRA models using FluxGym - no coding experience required.

Whether you want to generate images in a specific art style, create consistent characters, or adapt AI models for your

]]>
https://learn.thinkdiffusion.com/make-your-character-style-lora-stand-out-easy-lora-training-with-fluxgym/6826d355ba14220001ac324bTue, 08 Jul 2025 06:12:07 GMTMake Your Character & Style LoRA Stand Out - Easy LoRA Training with FluxGym!Make Your Character & Style LoRA Stand Out - Easy LoRA Training with FluxGym!

Want to create custom AI image models but find the process intimidating? This guide shows you how to train your own LoRA models using FluxGym - no coding experience required.

Whether you want to generate images in a specific art style, create consistent characters, or adapt AI models for your unique needs, you'll learn everything you need to know in about an hour.

Let’s dive in and unleash your creative potential with FluxGym!

What is LoRA?

Make Your Character & Style LoRA Stand Out - Easy LoRA Training with FluxGym!

LoRA, or Low-Rank Adaptation, is a technique in artificial intelligence that enables efficient and targeted fine-tuning of large pre-trained models, such as those used in image generation or language processing, without the need to retrain or modify the entire model.

Think of it as teaching an existing AI model new tricks—like recognizing a specific art style or character—while keeping all its original knowledge intact.
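For the technically curious, here is a tiny NumPy sketch of that idea (purely illustrative, not FluxGym's code): the original weight matrix stays frozen, and LoRA trains two much smaller matrices whose product acts as a low-rank correction on top of it.

    import numpy as np

    d = 1024      # width of the original weight matrix (illustrative)
    r = 8         # LoRA rank, much smaller than d
    alpha = 16    # LoRA scaling factor

    W = np.random.randn(d, d)           # frozen pretrained weights
    A = np.random.randn(r, d) * 0.01    # small trainable matrix
    B = np.zeros((d, r))                # small trainable matrix, starts at zero

    # Effective weights at inference time: original plus a low-rank update
    W_effective = W + (alpha / r) * (B @ A)

    # Only A and B are trained: 2*d*r parameters instead of d*d
    print(W.size, A.size + B.size)      # 1048576 vs 16384

That parameter gap is why a trained LoRA file stays small compared to the base model it adapts.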

This makes LoRA perfect for:

  • Creating images in a particular artistic style
  • Generating consistent characters for stories or projects
  • Adapting AI models for specific visual themes
  • Customizing outputs without massive computing resources
💡
It is especially useful in AI generation because it allows creators to quickly and efficiently tailor models for specific purposes, such as generating images in a particular style, capturing the likeness of a character, or adapting to a new domain, while preserving the general knowledge and capabilities of the original model. This makes LoRA a powerful tool for artists, developers, and businesses seeking flexible, scalable, and cost-effective ways to adapt AI models to their unique needs.


What is FluxGym?

Make Your Character & Style LoRA Stand Out - Easy LoRA Training with FluxGym!

FluxGym is an open-source, web-based tool designed to make training Flux LoRA models accessible and straightforward, especially for users with limited hardware resources. Created by the developer behind Pinokio, it stands out for its user-friendly Gradio interface.

Key features:

  • Easy-to-use web interface
  • Real-time training progress tracking
  • Automatic sample image generation
  • Works with limited hardware resources
  • No command line or coding required
💡
Key features include an intuitive web UI for configuring training parameters, real-time tracking of training progress, automated sample image generation, and support for custom base models and advanced training options.

Its simplicity, flexibility, and low hardware demands make FluxGym a standout choice for both beginners and experienced users seeking to train custom Flux LoRA models.


How to Train a Flux LoRA using FluxGym?

Make Your Character & Style LoRA Stand Out - Easy LoRA Training with FluxGym!
💡
Collecting images for LoRA datasets requires a different approach for each type of LoRA that you want to train.

The first part is preparing your dataset.

Tips for Datasets

  • Use short, descriptive filenames for your images.
  • Use high-quality, sharp images (ideally 1024x1024) with the subject clearly visible and centered.
  • Maintain consistency in style, lighting, and subject focus across your dataset.
  • Include variety by showing the subject in different poses, angles, and backgrounds to help the model generalize.
  • Crop to a square aspect ratio and organize images in a clearly named folder.
  • Write accurate captions for each image, describing the subject and style; review auto-generated captions for accuracy (see the example layout after this list).
  • Aim for quality over quantity: 20–30 well-chosen images are often better than a larger, inconsistent set.
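For reference, trainers like the one FluxGym wraps typically consume a folder of images paired with same-named caption text files, with your trigger word leading each caption. FluxGym builds this for you from the web UI, so treat the layout below as an illustrative sketch only (filenames and caption text are made up):

    my_lora_dataset/
        img_001.png
        img_001.txt    (e.g. "my_trigger_word, portrait, studio lighting, looking at camera")
        img_002.png
        img_002.txt
        ...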

What you'll need

- 20-30 high-quality images (1024x1024 recommended)
- Access to FluxGym (via ThinkDiffusion or local install)
- About 2+ hours for training (depends on the size of your dataset)
- Basic understanding of your creative goal

Procedures

Step 1

Make Your Character & Style LoRA Stand Out - Easy LoRA Training with FluxGym!
Step 1 Procedures:
1. Enter the name for your LoRA
2. Write a trigger word; it should be unique.
3. Choose flux-dev as the base model for training
4. Set the VRAM to 20GB
5. Set repeat training per image to 10
Make Your Character & Style LoRA Stand Out - Easy LoRA Training with FluxGym!
6. Set Train Epochs to 15
7. Training steps are computed automatically (see the worked example after this list)
8. Write sample image prompts and include the trigger word in them
9. Set Resize dataset images to 1024
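To sanity-check these settings, the auto-computed step count generally works out (assuming a batch size of 1) to the number of images times repeats per image times epochs. With a 20-image dataset and the settings above:

    total steps = images × repeats per image × epochs
                = 20 × 10 × 15
                = 3,000 steps

This matches the training settings used for the example character LoRAs later in this guide.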

Step 2

Make Your Character & Style LoRA Stand Out - Easy LoRA Training with FluxGym!
Step 2 Procedures:
1. Verify you have the correct number of images in your dataset
2. All files should be in PNG format
Make Your Character & Style LoRA Stand Out - Easy LoRA Training with FluxGym!
3. Click the AI captions button to generate captions automatically; captioning the images helps the training.

Step 3

Make Your Character & Style LoRA Stand Out - Easy LoRA Training with FluxGym!
Step 3 Procedures:
1. After completing Step 1 and Step 2, click Start Training.
2. Don't edit the auto-generated train script; this section is shown for reference and is not editable.
3. Monitor the training in the training log below.
Training may take a while, depending on the size of your image dataset and the training steps you set.
4. When training is complete, the training log will show it as completed.
5. Find your trained LoRA in .../fluxgym/output/<name of your LoRA folder>/

Examples of LoRA

Rodrigo Duterte (Character LoRA)

💡
Training Settings: 20 Image Dataset, 10 repeat images per train, 15 epochs, 3000 steps, Trigger word - Rodrigo_Duterte

Dan Mumford Art (Style LoRA)

💡
Training Settings: 30 Image Dataset, 10 repeat images per train, 10 epochs, 3000 steps, Trigger word - danmumford art

We would like to credit Dan Mumford for the concept and style of his art, which served as inspiration for this LoRA's training dataset.

Sarah Duterte (Character LoRA)

💡
Training Settings: 20 Image Dataset, 10 repeat images per train, 15 epochs, 3000 steps, Trigger word - Sara_Duterte

Anato Finnstark Art (Style LoRA)

💡
Training Settings: 30 Image Dataset, 10 repeat images per train, 10 epochs, 3000 steps, Trigger word - dark_fantasy anato_finnstark

We would like to credit Anato Finnstark for the concept and style of his art, which served as inspiration for this LoRA's training dataset.

💡
From my experience, I’ve found that there are many other types of LoRA models I can train beyond the usual examples, such as Object LoRA, Pose LoRA, Modifier LoRA, etc. Working with these specialized LoRA models gives me the flexibility to fine-tune outputs for specific objects, unique poses, or particular stylistic effects, depending on what my project needs. By experimenting with and integrating these different LoRA variants, I’m able to tailor my image generation process more precisely, which has been incredibly valuable for both creative and technical tasks within Stable Diffusion workflows.

Once you're comfortable with character and style LoRAs, you can experiment with:

  • Object LoRAs: For specific items or props
  • Pose LoRAs: For particular poses or compositions
  • Modifier LoRAs: For specific effects or modifications

Each type serves different creative needs and can be combined for even more precise control over your AI-generated images.


If you’re having issues with installation or slow hardware, you can try any of these workflows on a more powerful GPU in your browser with ThinkDiffusion.

If you're ready to try this yourself but need more powerful hardware, ThinkDiffusion offers FluxGym access through your browser with high-end GPUs.

The key to success is starting simple: pick one clear goal (like a specific art style), gather quality images, and follow the steps carefully. With a bit of practice, you'll be creating custom AI models that perfectly match your creative vision.

If you enjoy ComfyUI and you want to test out creating awesome animations, then feel free to check out this Flux LoRA Training using ComfyUI. And have fun out there with your noodles!

]]>
<![CDATA[Total Image Control with Flux Kontext: Complete Tutorial]]>https://learn.thinkdiffusion.com/total-image-control-with-flux-kontext-complete-tutorial/685e60aebf31340001f896d0Fri, 04 Jul 2025 01:55:25 GMTTotal Image Control with Flux Kontext: Complete Tutorial
Prompt: This man at the kitchen.
Total Image Control with Flux Kontext: Complete Tutorial

Flux Kontext lets you edit images precisely using text descriptions. Tell it what to change, and it modifies only those specific parts while keeping the rest intact. This is a gamechanger for control, and delivers incredible quality.

What is Flux Kontext?

Total Image Control with Flux Kontext: Complete Tutorial
Source: Flux Kontext

Flux Kontext is an AI image editing model by Black Forest Labs that excels at targeted modifications. Instead of generating entirely new images, it edits existing ones based on your text instructions.

Core capabilities:

  • Local editing: Change specific parts without affecting the whole image
  • Character consistency: Keep people looking the same across multiple edits
  • Style transfer: Apply artistic styles to existing images
  • Multi-round editing: Make several edits in sequence
  • Object manipulation: Add, remove, or modify objects

Available versions: Pro, Max, and Dev. This guide covers Flux1-kontext-dev, which is freely available under a non-commercial license.

💡
This guide is dedicated exclusively to Flux1-kontext-dev, a model whose weights are openly available under a non-commercial license, breaking the trend of powerful image-editing models being locked behind proprietary APIs.

Multi-Round Editing (Iterative)

Total Image Control with Flux Kontext: Complete Tutorial
Source: Flux Kontext
💡
Flux1 Dev Kontext enables you to perform multi-round image editing using just one input image, allowing you to apply a series of targeted edits while keeping the original style and details consistent. Each edit builds on the last, so you can refine or transform specific parts of your image step by step without losing quality or introducing inconsistencies. This fast, interactive workflow is ideal for artists and creators who want precise, context-aware control over their image editing process, making it easy to experiment and achieve complex results with ease.

How Multi-Round Editing Works

Flux Kontext's strength is iterative editing. You can:

  1. Start with one image
  2. Make a targeted edit
  3. Use the result for the next edit
  4. Continue refining step by step

Each edit builds on the previous one while maintaining visual consistency and quality.

Limitations of Flux Kontext

Total Image Control with Flux Kontext: Complete Tutorial
Illustration of a FLUX.1 Kontext failure case: After six iterative edits, the generation is visually degraded and contains visible artifacts. Source: Limitations of Flux Kontext
💡
In my experience with FLUX.1 Dev Kontext, I’ve noticed a few limitations in its current implementation. When I engage in extended, multi-turn editing sessions, I sometimes encounter visual artifacts that reduce the overall image quality. There have also been instances where the model doesn’t completely follow my instructions, occasionally missing specific details from my prompts. I’ve found that its limited world knowledge can make it challenging to generate content that is truly contextually accurate. Additionally, I’ve observed that the distillation process can introduce its own set of artifacts, which can impact the fidelity of the final output.
  • Quality degradation: After 6+ iterative edits, images may show artifacts
  • Instruction following: Sometimes misses specific prompt details
  • Limited world knowledge: May struggle with contextually accurate content
  • Distillation artifacts: Processing can introduce visual issues

Get ready to unlock a smarter, sharper, and more creative approach to image generation, because your next masterpiece is just around the corner!

How to Use Flux Kontext in ComfyUI

Installation guide

  1. Download the workflow file
  2. Open ComfyUI (local or ThinkDiffusion)
  3. Drag the workflow file into the ComfyUI window
  4. If you see red nodes, install missing components:
    • ComfyUI Manager > Install Missing Custom Nodes

Verified to work on ThinkDiffusion Build: June 27, 2025

ComfyUI v0.3.42 using the flux1-kontext-dev.safetensors model

Note: We specify the build date because ComfyUI and custom node versions updated after this date may change the behavior or outputs of the workflow.

Minimum Machine Size: Ultra

Use the specified machine size or higher to ensure it meets the VRAM and performance requirements of the workflow

💡
Download the workflow and drag & drop it into your ComfyUI window, whether locally or on ThinkDiffusion. If you're using ThinkDiffusion, minimum requirement is the Turbo 24gb machine, but we do recommend the Ultra 48gb machine.

Custom Nodes

If there are red nodes in the workflow, it means the workflow is missing some of the required nodes. Install the custom nodes so the workflow can run.

  1. Go to the ComfyUI Manager  > Click Install Missing Custom Nodes
Total Image Control with Flux Kontext: Complete Tutorial
  1. Check the list to see which custom nodes need to be installed, then click Install.
Total Image Control with Flux Kontext: Complete Tutorial

Required Models

For this guide you'll need to download these 4 recommended models.

1. flux1-kontext-dev.safetensors
2. clip_l.safetensors
3. t5xxl_fp16.safetensors
4. ae.safetensors
  1. Go to ComfyUI Manager  > Click Model Manager
Total Image Control with Flux Kontext: Complete Tutorial
  1. Search for the models listed above; when you find the exact model you're looking for, click Install, and press Refresh when you are finished.
Total Image Control with Flux Kontext: Complete Tutorial

If Model Manager doesn't have them: Use the direct download links (included with the workflow) and upload them through ThinkDiffusion MyFiles > Upload URL. Refer to our docs for more guidance on this.

You can also use the model path source below instead: copy the model's link address and paste it into ThinkDiffusion MyFiles using Upload URL.

Model Name | ThinkDiffusion Upload Directory
flux1-kontext-dev.safetensors | .../comfyui/models/diffusion_models/
clip_l.safetensors | .../comfyui/models/clip/
t5xxl_fp16.safetensors | .../comfyui/models/text_encoders/
ae.safetensors | .../comfyui/models/vae/

Step-by-step Workflow Guide

This workflow was pretty easy to set up and runs well from the default settings. Here are a few steps where you might want to take extra note.

Steps Recommended Nodes
1. Load an Image

Load an image. Make sure it is clear, sharp, and free from artifacts. If you want to combine two images, just enable the 2nd Load Image node.
Total Image Control with Flux Kontext: Complete Tutorial
2. Set the Models

Set the models as shown in the image.
Total Image Control with Flux Kontext: Complete Tutorial
3. Write a Prompt

Write a simple prompt that names the subject and describes what you want in the output. See the examples below for guidance.
Total Image Control with Flux Kontext: Complete Tutorial
4. Check the Sampling

Use the settings shown in the image. Do not raise the CFG, as higher values may cause artifacts.
Total Image Control with Flux Kontext: Complete Tutorial
5. Check the Output

Total Image Control with Flux Kontext: Complete Tutorial

Examples

Character Consistency

Total Image Control with Flux Kontext: Complete Tutorial
Prompt: Use this woman, create an image broadcasting news in television
Total Image Control with Flux Kontext: Complete Tutorial
Prompt: Use this woman, create an image running on a race track

Add / Edit Text

Total Image Control with Flux Kontext: Complete Tutorial
Prompt: Replace the "LAS VEGAS NEVADA" to "SEBASTIAN THE AI EXPERT"
Total Image Control with Flux Kontext: Complete Tutorial
Prompt: Replace the "Wheeler" to "Lonely"

Remove Objects

Total Image Control with Flux Kontext: Complete Tutorial
Prompt: Remove the Beard and moustache, revealing his cleaner face.
Total Image Control with Flux Kontext: Complete Tutorial
Prompt: Remove the bicycles, revealing their legs naturally.

Style Reference

Total Image Control with Flux Kontext: Complete Tutorial
Prompt: Using this style, create an image of New York city at night
Total Image Control with Flux Kontext: Complete Tutorial
Prompt: Using this crochet art style, create a image of a human family in the living room

Switch View

Total Image Control with Flux Kontext: Complete Tutorial
Prompt: Rotate the cat 180 degrees to view directly from behind the cat, showing its back and tail
Total Image Control with Flux Kontext: Complete Tutorial
Prompt: Rotate the camera 90 degrees to view directly from side of the car, showing its side while maintaining its color and shape

Multiple Input Images

Total Image Control with Flux Kontext: Complete Tutorial
Prompt: 1 sweet couple walking at the park and their eyes facing the camera. The woman man is holding a bouquet of flowers and holds the man's arm.
Total Image Control with Flux Kontext: Complete Tutorial
Prompt: 2 pair of shoes displayed at the shoe store.

Change Light

Total Image Control with Flux Kontext: Complete Tutorial
Prompt: Convert to afternoon scene with soft golden sunset light and gentle dusk mist, maintaining the same composition and architectural details
Total Image Control with Flux Kontext: Complete Tutorial
Prompt: Convert to noon scene with sunlight above and gentle noon heat, maintaining the same composition and architectural details

Image Editing

Total Image Control with Flux Kontext: Complete Tutorial
Prompt: Turn the SUV vehicle into green color
Total Image Control with Flux Kontext: Complete Tutorial
Prompt: Turn his coffee into a barbeque

Restyle

Total Image Control with Flux Kontext: Complete Tutorial
Prompt: Transform to watercolor art style
Total Image Control with Flux Kontext: Complete Tutorial
Prompt: Transform this into minimalist lineart style, its an image about the landscape where mountains, sky and trees are visible

Tips for Better Results

  1. Start simple: Begin with basic edits before complex changes
  2. Be specific: Clear descriptions work better than vague ones
  3. Check quality: Monitor for artifacts after each edit
  4. Limit iterations: Avoid more than 5-6 sequential edits
  5. Use good source images: High-quality inputs produce better outputs

Troubleshooting

  • Red nodes: Install missing custom nodes via ComfyUI Manager
  • Model errors: Verify all 4 models are downloaded correctly
  • Poor results: Simplify prompts and retry
  • Artifacts: Reduce CFG settings or start with fresh image




If you’re having issues with installation or slow hardware, you can try any of these workflows on a more powerful GPU in your browser with ThinkDiffusion.

]]>
<![CDATA[MAGREF: Generate AI Videos with Multiple People and Objects from Images]]>https://learn.thinkdiffusion.com/magref-generate-ai-videos-with-multiple-people-and-objects-from-images/685a9ce64b09670001f6e8e7Wed, 02 Jul 2025 13:31:03 GMT
0:00
/0:03
MAGREF: Generate AI Videos with Multiple People and Objects from Images

Prompt: A cute white cat walking along the park. A big green trash bin is visible near the cat.

MAGREF lets you create videos from multiple reference images while keeping each person or object looking consistent throughout the video. Instead of generating random characters, you can use your own photos to control exactly who or what appears in the final video. This guide shows you how to set up and use MAGREF in ComfyUI to create videos with multiple subjects that maintain their original appearance.

What is MAGREF?

0:00
/0:23

Source: MAGREF

MAGREF, or Masked Guidance for Any-Reference Video Generation, is a diffusion-based AI framework that generates videos from multiple reference images while preserving subject identity. It uses region-aware dynamic masking to handle any number of subjects and pixel-wise channel concatenation to maintain fine details. This makes it especially useful for creating videos where specific people, objects, or backgrounds need to appear exactly as they do in your source images.
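As a loose illustration of what pixel-wise channel concatenation means here (this is not MAGREF's actual code, just the general shape of the idea), the encoded reference images are stacked along the channel axis so every spatial position carries information from each reference:

    import numpy as np

    # Encoded reference features, shape (channels, height, width); values are dummies
    person     = np.random.randn(4, 60, 104)
    obj        = np.random.randn(4, 60, 104)
    background = np.random.randn(4, 60, 104)

    # Stack the references along the channel axis so each pixel location
    # keeps details from every reference image
    conditioning = np.concatenate([person, obj, background], axis=0)
    print(conditioning.shape)   # (12, 60, 104)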

MAGREF is especially valuable for creators, animators, and researchers seeking to produce customizable, multi-subject videos with exceptional fidelity and control, making it a standout solution for both creative and professional video generation needs.

Types of MAGREF

MAGREF: Generate AI Videos with Multiple People and Objects from Images
Source: Github Repo

MAGREF offers three flexible video generation modes: 

  • Single ID, which uses one reference image to keep a single subject consistent throughout the video; 
  • Multi-ID, which allows multiple subjects from different reference images to appear together while maintaining their unique identities;
  • ID-Object-Background, which lets users combine references for people, objects, and backgrounds to create complex, multi-layered scenes.

These options make MAGREF suitable for everything from simple, personalized videos to rich, detailed multi-subject compositions.
0:00
/0:03

Prompt: A man and woman happily hug each other.

With just a few easy steps, you’ll see how effortless and fun it can be to bring your memories to life in ways you never thought possible!

How to Use Wan MAGREF for Video Generation

Installation guide

  1. Download the workflow file
  2. Open ComfyUI (local or ThinkDiffusion)
  3. Drag the workflow file into the ComfyUI window
  4. If you see red nodes, install missing components:
  • ComfyUI Manager > Install Missing Custom Nodes
💡
Update 08/13/2025: Workflow replaced

Verified to work on ThinkDiffusion Build: June 6, 2025

ComfyUI v0.3.47 with support for the Wan2_1-Wan-I2V-MAGREF-14B_fp16_pure.safetensors model

Note: We specify the build date because ComfyUI and custom node versions updated after this date may change the behavior or outputs of the workflow.

Minimum Machine Size: Ultra

Use the specified machine size or higher to ensure it meets the VRAM and performance requirements of the workflow

💡
Download the workflow and drag & drop it into your ComfyUI window, whether locally or on ThinkDiffusion. If you're using ThinkDiffusion, minimum requirement is the Turbo 24gb machine, but we do recommend the Ultra 48gb machine.

Custom Nodes

If there are red nodes in the workflow, it means the workflow is missing some of the required nodes. Install the custom nodes so the workflow can run.

  1. Go to the ComfyUI Manager  > Click Install Missing Custom Nodes
MAGREF: Generate AI Videos with Multiple People and Objects from Images
  1. Check the list to see which custom nodes need to be installed, then click Install.
MAGREF: Generate AI Videos with Multiple People and Objects from Images

Required Models

For this guide you'll need to download these 5 recommended models.

1. Wan2_1-Wan-I2V-MAGREF-14B_fp16_pure
2. umt5_xxl_fp16.safetensors
3. wan_2.1_vae.safetensors
4. Wan21_T2V_14B_lightx2v_cfg_step_distill_lora_rank32.safetensors
5. clip_vision_h.safetensors
  1. Go to ComfyUI Manager  > Click Model Manager
MAGREF: Generate AI Videos with Multiple People and Objects from Images
  1. Search for the models listed above; when you find the exact model you're looking for, click Install, and press Refresh when you are finished.
MAGREF: Generate AI Videos with Multiple People and Objects from Images

If Model Manager doesn't have them: Use the direct download links (included with the workflow) and upload them through ThinkDiffusion MyFiles > Upload URL. Refer to our docs for more guidance on this.

You can also use the model path source below instead: copy the model's link address and paste it into ThinkDiffusion MyFiles using Upload URL.

Model Name | ThinkDiffusion Upload Directory
Wan2_1-Wan-I2V-MAGREF-14B_fp16_pure.safetensors | .../comfyui/models/diffusion_models/
umt5_xxl_fp16.safetensors | .../comfyui/models/text_encoders/
wan_2.1_vae.safetensors | .../comfyui/models/vae/
Wan21_T2V_14B_lightx2v_cfg_step_distill_lora_rank32.safetensors | .../comfyui/models/lora/
clip_vision_h.safetensors | .../comfyui/models/clip_vision/

Step-by-step Workflow Guide

This workflow was pretty easy to set up and runs well from the default settings. Here are a few steps where you might want to take extra note.

There are two node groups to keep in mind.
You can disable the 2nd group at first, so you can check the brightness and saturation of each input image in the 1st group before doing a full run of the workflow. Mismatched brightness or saturation across the images can lead to uneven color in the output.
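If you want a quick, optional way to compare your inputs before a full run, a small script outside ComfyUI can report each image's average brightness (requires Pillow; the filenames below are placeholders for your own reference images):

    from PIL import Image, ImageStat

    for path in ["person.png", "object.png", "background.png"]:   # your reference images
        img = Image.open(path).convert("L")              # grayscale for a rough brightness estimate
        mean_brightness = ImageStat.Stat(img).mean[0]    # 0 = dark, 255 = bright
        print(f"{path}: mean brightness {mean_brightness:.1f}")

    # Large gaps between these numbers suggest the images need levels or color
    # adjustment before they go into the 1st group.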

Steps Recommended Nodes
1. Load Image
(1st Group)

Load a high-quality image; it should not be pixelated or blurred. Just follow the settings as shown in the image.
MAGREF: Generate AI Videos with Multiple People and Objects from Images
Check Bridge Settings
(optional)

This area is optional and is where you choose which type of MAGREF you want to use. If you're using only one image, route it through a single bridge and connect it to the Resize Image and Get Image Size nodes. If you need multiple images, use the Image Concatenate Multi node, wire it up according to your number of inputs, and reconnect it on the other side.
MAGREF: Generate AI Videos with Multiple People and Objects from Images
2. Set Input Settings
(2nd Group)

Set the input settings for the image. The model is only compatible with 480p and 720p. Take note that the frame count should not go above 81, as the workflow may crash beyond that.
MAGREF: Generate AI Videos with Multiple People and Objects from Images
3. Set Models
(2nd Group)

Set the models as seen on the image.
MAGREF: Generate AI Videos with Multiple People and Objects from Images
4. Write Prompt
(2nd Group)

Write a simple prompt; you don't need to be specific or super detailed. Describe the subject and add some simple action words.
MAGREF: Generate AI Videos with Multiple People and Objects from Images
5. Check Sampling Settings
(2nd Group)

Check the sampling settings as shown in the image. Steps should be 4-6 and CFG should be 1-2 only, nothing else.
MAGREF: Generate AI Videos with Multiple People and Objects from Images
6. Check Output
(2nd Group)

Check the generated output. It may not be perfect on the first run; if you're not happy with the result, just tweak the prompt and rerun the workflow.
MAGREF: Generate AI Videos with Multiple People and Objects from Images

Examples

Single ID Video

0:00
/0:03

Prompt: A man is having a concert at the stage, behind him are his band mates holding instrument.

0:00
/0:03

Prompt: A man wearing this shirt and he is walking at the street.


Multi-ID Video

0:00
/0:03

Prompt: 4 persons are having a meeting at the office, discussing some serious matter.

0:00
/0:03

Prompt: A donkey, an orange cat and a cute white dog were seen walking at the farm.


ID + Object + Background

0:00
/0:03

Prompt: A girl is drinking an Asahi canned beer at her living room with green sofa.

0:00
/0:03

Prompt: A girl reading the Holy Bible in her room.


If you’re having issues with installation or slow hardware, you can try any of these workflows on a more powerful GPU in your browser with ThinkDiffusion.

]]>