GlyphPrinter: Region-Grouped Direct Preference Optimization for Glyph-Accurate Visual Text Rendering

Xincheng Shuai¹,*, Ziye Li¹,*, Henghui Ding¹,✉, Dacheng Tao²

* Equal Contribution, ✉ Corresponding Author

¹Fudan University, ²Nanyang Technological University

🔥🔥🔥 News

[2026/03/15] Released the training code and the GlyphCorrector dataset. 🤗 GlyphCorrector.
[2026/03/13] Released the inference code and model weights. 🤗 Model Weight.
[2026/02/21] GlyphPrinter was accepted to CVPR 2026. 👏👏

😊 Introduction

GlyphPrinter is a preference-based text rendering framework that removes the need for an explicit reward model in visual text generation. It targets common failure cases of existing text-to-image (T2I) models, such as stroke distortions and incorrect glyphs, which are most frequent when rendering complex Chinese characters, multilingual text, or out-of-domain symbols.

🔧 Key Features

GlyphCorrector Dataset: A specialized dataset with region-level glyph preference annotations, facilitating the model's ability to learn localized glyph correctness.
R-GDPO (Region-Grouped Direct Preference Optimization): Unlike standard DPO, which models global image-level preferences, R-GDPO focuses on the local regions where glyph errors typically occur. It optimizes inter- and intra-sample preferences over annotated regions to improve glyph accuracy.
Regional Reward Guidance (RRG): A novel inference strategy that samples from an optimal distribution with controllable glyph accuracy.
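To make the idea of a region-level preference concrete, here is a minimal NumPy sketch. It is not the R-GDPO objective from the paper: it only assumes a Diffusion-DPO-style loss in which the denoising error is averaged inside an annotated region instead of over the whole image. The function names `masked_mse` and `region_dpo_loss`, the `beta` parameter, and the mask convention (1 = inside the region) are all hypothetical, and the inter- and intra-sample grouping is not reproduced.

```python
import numpy as np


def masked_mse(pred, target, mask):
    """Mean squared error restricted to pixels where mask == 1 (assumed convention)."""
    mask = mask.astype(np.float64)
    sq_err = (pred - target) ** 2
    # Guard against an empty region to avoid division by zero.
    return float((sq_err * mask).sum() / max(mask.sum(), 1.0))


def region_dpo_loss(err_win_policy, err_win_ref,
                    err_lose_policy, err_lose_ref, beta=1.0):
    """DPO-style loss on region-level denoising errors (lower error is better).

    err_win_*  : masked error on the preferred (correct-glyph) region
    err_lose_* : masked error on the rejected (incorrect-glyph) region
    *_policy   : error of the model being trained
    *_ref      : error of the frozen reference model
    """
    # How much worse the policy is than the reference on the preferred region,
    # minus the same quantity on the rejected region.
    margin = (err_win_policy - err_win_ref) - (err_lose_policy - err_lose_ref)
    # -log(sigmoid(-beta * margin)) == log(1 + exp(beta * margin)),
    # computed stably with logaddexp.
    return float(np.logaddexp(0.0, beta * margin))
```

When the policy matches the reference on both regions, the margin is zero and the loss equals log 2. The loss falls below that value once the policy fits the preferred region better than the reference does while fitting the rejected region worse.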

👷 Pipeline

The training of GlyphPrinter consists of two stages:

Stage 1 (Fine-Tuning): The model is first fine-tuned on multilingual synthetic and realistic text images to establish a strong baseline for text rendering.
Stage 2 (Region-Level Preference Optimization): The model is optimized using the R-GDPO objective on the GlyphCorrector dataset. This stage aligns model outputs with accurate glyph regions while discouraging incorrect ones, resulting in superior glyph fidelity.

💻 Quick Start

Environment setup

cd GlyphPrinter
conda create -n GlyphPrinter python=3.11.10 -y
conda activate GlyphPrinter

Requirements installation

pip install --upgrade -r requirements.txt

Inference

python app.py

Default server port: 7897.

CLI inference without Gradio: inference.py loads conditions directly from the saved_conditions directory. You can construct an npz-format condition manually through app.py.

# list available saved conditions
python3 inference.py --list-conditions

# run inference using the latest condition in saved_conditions/
python3 inference.py \
  --prompt "The colorful graffiti font <sks1> printed on the street wall" \
  --save-mask

# run inference using a specific condition file
python3 inference.py \
  --condition condition_1.npz \
  --output-dir outputs_inference
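Before passing a condition file to inference.py, it can help to check what it contains. The sketch below uses only NumPy's standard `.npz` reader and makes no assumption about the array names stored by app.py; it simply reports whatever it finds. The helper name `describe_condition` is hypothetical and not part of this repository.

```python
import numpy as np


def describe_condition(path):
    """Return {array_name: (shape, dtype)} for every array stored in an .npz file."""
    # allow_pickle=False refuses pickled object arrays, which is the safer default;
    # accessing such an entry raises ValueError instead of loading it.
    with np.load(path, allow_pickle=False) as data:
        return {
            name: (data[name].shape, str(data[name].dtype))
            for name in data.files
        }
```

For example, `describe_condition("saved_conditions/condition_1.npz")` would list the arrays in that file, assuming it exists at that path.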

🏃 R-GDPO Training

1. Prepare GlyphCorrector dataset

Please first download our regional preference dataset GlyphCorrector.

Then, place it under dataset/GlyphCorrector:

mkdir -p dataset
huggingface-cli download FudanCVL/GlyphCorrector GlyphCorrector.zip \
  --repo-type dataset \
  --local-dir dataset \
  --local-dir-use-symlinks False
unzip -q dataset/GlyphCorrector.zip -d dataset

After extraction, verify the folder structure:

dataset/GlyphCorrector/
├── annotated_mask/
│   ├── batch_0/
│   │   ├── generated_0_mask.jpg
│   │   └── ...
│   └── batch_1/
└── inference_results/
    ├── batch_0/
    │   ├── generated_0.png
    │   ├── glyph_0.png
    │   ├── mask_0.png
    │   ├── prompt.txt
    │   └── ...
    └── batch_1/
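The layout above can also be checked programmatically. The following sketch is one possible check, not a script shipped with the repository. It assumes, based only on the file names shown in the tree, that every `generated_<i>.png` has a matching `glyph_<i>.png` and `mask_<i>.png` in the same batch and a `generated_<i>_mask.jpg` under `annotated_mask/<batch>/`. If the dataset does not annotate every image, treat the reported items as hints rather than errors. The function name `check_glyphcorrector` is hypothetical.

```python
from pathlib import Path


def check_glyphcorrector(root):
    """Return a list of human-readable problems found under the dataset root."""
    root = Path(root)
    results_dir = root / "inference_results"
    masks_dir = root / "annotated_mask"

    problems = [f"missing directory: {d}"
                for d in (results_dir, masks_dir) if not d.is_dir()]
    if problems:
        return problems

    for batch in sorted(p for p in results_dir.iterdir() if p.is_dir()):
        if not (batch / "prompt.txt").is_file():
            problems.append(f"{batch.name}: missing prompt.txt")
        for gen in sorted(batch.glob("generated_*.png")):
            # "generated_0" -> "0"
            idx = gen.stem.split("_", 1)[1]
            # Assumed siblings of each generated image in the same batch.
            for sibling in (f"glyph_{idx}.png", f"mask_{idx}.png"):
                if not (batch / sibling).is_file():
                    problems.append(f"{batch.name}: missing {sibling}")
            # Assumed location of the region-level annotation.
            annotated = masks_dir / batch.name / f"generated_{idx}_mask.jpg"
            if not annotated.is_file():
                problems.append(f"{batch.name}: missing annotated mask {annotated.name}")
    return problems
```

Calling `check_glyphcorrector("dataset/GlyphCorrector")` returns an empty list when every assumed file is present.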

2. Run R-GDPO training

Use the provided script for R-GDPO training:

bash dpo/train_dpo_group.bash

⚙️ Default Model Settings

Base FLUX model: black-forest-labs/FLUX.1-dev
Stage 1 Transformer path: pretrained/pretrained_stage1_attn_mask_transformer-stage-1-2
Stage 2 LoRA path: pretrained/dpo-checkpoint

💗 Citation

@inproceedings{GlyphPrinter,
  title={{GlyphPrinter}: Region-Grouped Direct Preference Optimization for Glyph-Accurate Visual Text Rendering},
  author={Shuai, Xincheng and Li, Ziye and Ding, Henghui and Tao, Dacheng},
  booktitle={CVPR},
  year={2026}
}
