Logo showcasing transformation with InstructDesign Flow

InstructDesign Flow - FLUX.1 Kontext [dev] Hackathon Submission

Inspiration

Transforming web interfaces has been a long-time dream of mine. As someone who builds interfaces for AI applications constantly, I've always wished for a way to iterate on designs as naturally as describing them. Every time I needed to adjust a theme, apply a new design system, or create mockups, I found myself spending hours in design tools when I could simply describe what I wanted.

The release of FLUX.1 Kontext [dev] was the catalyst that made this dream achievable. Its ability to understand spatial relationships and maintain consistency across transformations meant I could finally train a model that truly understands web design language. This wasn't just about generating pretty pictures - it was about creating a tool that understands the structure and intent of interface design.

What it does

InstructDesign Flow transforms web interfaces through natural language instructions. It's a fine-tuned FLUX.1 Kontext [dev] model that acts as your AI design assistant:

Transform any webpage screenshot with simple text prompts like "make it dark mode" or "apply glassmorphism effects"
Apply 100+ pre-trained design styles including cyberpunk, material design, retro terminals, and artistic themes
Maintain text stability and layout structure while completely transforming the aesthetic
Generate device mockups by placing interfaces on iPhones, MacBooks, or billboards
Enable sequential transformations - take an output and transform it again for iterative design exploration

The model achieves 85%+ instruction adherence while keeping text readable and layouts functional - a critical achievement for practical design work.

How we built it

Building InstructDesign Flow was an intensive 7-day journey spanning over 70 hours of training, inference, and iteration:

Dataset Creation (Days 1-2)

Scraped 5,000+ public webpages for diverse design examples
Generated transformation pairs using automated visual processing
Created instruction captions with LLM-based annotation
Refined to 937 high-quality training pairs after multiple rounds of validation

Training Pipeline (Days 3-7)

The training was executed with surgical precision using a professional setup from day one:

Infrastructure:
- NVIDIA H100 GPU with 80GB VRAM
- FLUX.1 Kontext [dev] base model
- LoRA rank 256 fine-tuning
- 16,000 training steps total

Training iterations:

First attempt (1 days): 350 pairs, baseline quality established
Second iteration (2 days): Expanded dataset, improved captions
Third iteration (2 days): Refinement based on evaluation
Final training (2+ days): Full 937-pair dataset, consolidated approach

The VRAM usage peaked at 80GB of 81GB available - we pushed the hardware to its absolute limits. Using tmux sessions ensured training could continue uninterrupted, and checkpoints were saved every 1,000 steps for safety.

Deployment Architecture

Created a complete end-to-end deployment pipeline:

ComfyUI workflow for optimal inference settings
Docker containerization with automatic model downloads
FastAPI wrapper with queue management and WebSocket support
Next.js frontend with gallery and real-time generation

Challenges we ran into

GPU Constraints

Working with a single GPU in limited time was the biggest challenge. With VRAM maxed out at 81/81GB, there was zero margin for error. One wrong parameter could crash weeks of work. The model was trained 4 times with dataset updates between each iteration - a testament to the iterative nature of achieving quality.

Dataset Quality

Creating a dataset that teaches design intent rather than just visual copying required multiple refinement rounds. Initial results showed the model copying layouts perfectly but missing stylistic instructions. We had to balance between:

Diversity of transformations
Consistency of instruction following
Preservation of text and layout structure

Time Pressure

Limited hackathon timeline meant working in parallel:

Training ran 24/7 while I built the deployment infrastructure
API development happened during model evaluation periods
Frontend was built while the final model checkpoint was still training

Even now, with the full model trained, testing remaining epochs would require more time than available. But the results speak for themselves; the model is doing what it was intended to do very well.

Accomplishments that we're proud of

Text Stability Achievement

The biggest technical accomplishment is maintaining readable text during transformations. Even with a distilled model and semi-synthetic dataset, InstructDesign preserves text content while completely changing aesthetics. This wasn't trivial - it required careful dataset curation and training parameters.

System

In just 7 days, we built:

937 curated training pairs with validated transformations
Fine-tuned LoRA achieving 85%+ instruction adherence
Complete Docker deployment with automatic model downloads
RESTful API with queue management and WebSocket updates
Frontend application with 18 visual examples and one-click reproduction
100+ design presets covering every major design trend

End-to-End Pipeline

From dataset creation to containerization, every component was thoughtfully architected:

Dataset → Training → Model → ComfyUI → Docker → API → Frontend

This demonstrates how AI fine-tunes should be deployed: not just as model weights, but as complete, usable systems.

What we learned

FLUX.1 Kontext Mastery

We learned to push FLUX.1 Kontext [dev] far beyond its typical use cases. The model's ability to understand spatial relationships makes it perfect for design tasks, but unlocking this required:

Specific training configurations for design consistency
Careful prompt engineering for instruction following
Optimal inference settings (CFG=1.0, Guidance=5.0, Steps=20)

Specialization Beats Generalization

Rather than relying on general-purpose models, training a specialized LoRA for web design proved immensely powerful. The focused dataset of 937 pairs outperformed what would require 10,000+ pairs in a general model.

Infrastructure is Critical

Having professional infrastructure from day one - tmux sessions, checkpoint management, Docker deployment - meant zero time wasted on preventable issues. Every hour counted in the hackathon timeline.

What's next for InstructDesign Flow

Immediate Plans

Web Playground: Public interface at instructdesign.ai (domain ready)
Expanded Presets: Growing from 100 to 500+ design transformations
API Marketplace: Offering the model as a service for design tools

Technical Roadmap

Multi-step Workflows: Chain transformations for complex design systems
Component-Level Control: Transform specific UI elements independently
Real-time Collaboration: WebSocket-based collaborative design sessions
Mobile App: iOS/Android apps for on-the-go design transformation

Community & Ecosystem

Open Dataset Expansion: Community-contributed transformation pairs
Plugin Development: Figma, Sketch, and Adobe XD integrations
Educational Content: Tutorials for training custom design LoRAs

Research Directions

FLUX.2 Integration: Preparing for next-generation base models
Video Transformation: Extending to UI animation and micro-interactions
Code Generation: From transformed designs directly to React/Vue components

Business Development

Design agencies seeking rapid prototyping tools
SaaS companies wanting automated theme generation
Educational platforms teaching web design

InstructDesign Flow represents 10 weeks of accumulated knowledge compressed into 7 days of intense execution. It's not just a model - it's a complete system demonstrating how specialized AI should be built, trained, and deployed for real-world impact.

The future of web design isn't replacing designers - it's empowering them to iterate at the speed of thought. InstructDesign Flow is the first step toward that future.

Links:

Built With

comfyui
cuda
docker
fastapi
flux.1-kontext-[dev]
huggingface
lora
next.js
python
react
tailwind-css
typescript
websocket

Updates

Umut T. started this project — Sep 25, 2025 04:34 PM EDT

Leave feedback in the comments!

Log in or sign up for Devpost to join the conversation.