UI-S1: Advancing GUI Automation via Semi-online Reinforcement Learning

[📖 Paper] [🤗 UI-S1-7B] [🤗 Daily Paper] [🤗 Dataset]

🔥 Overview

We present Semi-online RL, a novel paradigm that simulates online reinforcement learning using offline trajectories, thereby enabling the efficient training of MLLM-based GUI agents with enhanced multi-turn interaction capabilities.

Our UI-S1-7B achieves state-of-the-art performance among open-source 7B models on both the semi-online metric (SOP) and the online benchmark (AndroidWorld).

🗞️ News

2026-04-06: 🔥 UI-S1 was accepted to the ACL 2026 main conference.
2025-10-28: We release part of our training dataset.
2025-09-17: We release the UI-S1 training and evaluation code.
2025-09-16: We release the checkpoints of the UI-S1-7B model.
2025-09-16: We release our paper.

Detailed results

Setup

conda create -n ui-s1 python=3.11
conda activate ui-s1
cd UI-S1
pip install -e .
pip install vllm==0.8.2
pip install flash-attn==2.7.4.post1 --no-build-isolation
# or install the prebuilt wheel from https://github.com/Dao-AILab/flash-attention/releases/tag/v2.7.4.post1
# pip install flash_attn-2.7.4.post1+cu12torch2.6cxx11abiFALSE-cp311-cp311-linux_x86_64.whl

We use swanlab for training visualization. Set your own swanlab API key and host in verl/utils/tracking.py.
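
Before launching training, it can help to confirm that the packages pinned in the installation steps above are visible in the active environment. The following is a minimal sketch and not part of the repository: the `check_module` helper and the `PYTHON` variable are hypothetical names introduced here, and the check relies only on Python's standard `importlib.util.find_spec`, which locates a top-level module without importing it.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of UI-S1): report whether a Python module
# can be located in the active environment, without importing it.
PYTHON="${PYTHON:-python3}"

check_module() {
    local module="$1"
    if "$PYTHON" -c "import importlib.util, sys; sys.exit(0 if importlib.util.find_spec('$module') else 1)" 2>/dev/null; then
        echo "ok: $module"
    else
        echo "missing: $module"
        return 1
    fi
}

# Modules installed by the pinned pip commands in the setup steps.
for module in vllm flash_attn; do
    check_module "$module" || true
done
```

A `missing:` line usually means the `ui-s1` conda environment is not activated or one of the pip steps failed.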

Data

Download AndroidControl, placing the images in datasets/AndroidControl/images and the training file at datasets/android_control_train_example.jsonl.
[New] We also provide 1,000 training examples on Dataset.
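
The two paths named above can be checked before training starts. This is a sketch under stated assumptions rather than a script shipped with the repository: `check_dataset_layout` is a hypothetical helper, and it assumes the paths are relative to the repository root and that the `.jsonl` file holds one record per line.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of UI-S1): verify the dataset paths
# relative to a repository root (defaults to the current directory).
check_dataset_layout() {
    local root="${1:-.}"
    local images="$root/datasets/AndroidControl/images"
    local train="$root/datasets/android_control_train_example.jsonl"
    local status=0

    if [ -d "$images" ]; then
        echo "found images directory: $images"
    else
        echo "missing images directory: $images"
        status=1
    fi

    if [ -f "$train" ]; then
        # Count newline-terminated records in the JSONL file.
        echo "found training file with $(wc -l < "$train" | tr -d ' ') lines"
    else
        echo "missing training file: $train"
        status=1
    fi

    return "$status"
}

check_dataset_layout . || echo "dataset layout is incomplete"
```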

Train

bash scripts/train_example.sh
python scripts/model_merger.py merge --local_dir checkpoints/XXX

Inference and evaluation

# 1. Launch the vLLM server
vllm serve /checkpoints-7B --served-model-name UI-S1-7B --tensor_parallel_size 1 --trust-remote-code --limit-mm-per-prompt image=2

# 2. Evaluate UI-S1-7B's performance on SOP
python /evaluation/eval_qwenvl.py --model_name UI-S1-7B

# Evaluate other models
python /evaluation/eval_qwenvl.py --model_name Qwen2.5-VL-7B
python /evaluation/eval_agentcpm.py --model_name AgentCPM-GUI-8B
python /evaluation/eval_os-atlas-7b.py --model_name OS-Atlas-7B
python /evaluation/eval_os-genesis-7b.py --model_name OS-Genesis-7B
python /evaluation/eval_ui-tars-7b.py --model_name UI-TARS-7B
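
# The vLLM server can take a while to load the checkpoint, so the evaluation
# script may fail if it starts too early. The function below is a minimal
# sketch, not part of the repository: `wait_for_server` is a hypothetical
# helper, and the address assumes vLLM's default port 8000 on the local
# machine, since the serve command above does not set `--port`.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of UI-S1): poll an HTTP endpoint until it
# answers successfully or roughly `timeout` seconds have passed.
wait_for_server() {
    local url="$1"
    local timeout="${2:-300}"
    local waited=0

    until curl --silent --fail --output /dev/null --max-time 2 "$url"; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "server not ready after ${timeout}s: $url"
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    echo "server ready: $url"
}

# Example usage (commented out so the block has no side effects):
# wait_for_server "http://localhost:8000/v1/models" 600 \
#     && python /evaluation/eval_qwenvl.py --model_name UI-S1-7B
```

# `/v1/models` is part of vLLM's OpenAI-compatible API, so a successful
# response indicates the server is accepting requests.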

⭐️ Citation

If you find this project useful, please consider citing our paper.

@article{lu2025ui,
  title={UI-S1: Advancing GUI Automation via Semi-online Reinforcement Learning},
  author={Lu, Zhengxi and Ye, Jiabo and Tang, Fei and Shen, Yongliang and Xu, Haiyang and Zheng, Ziwei and Lu, Weiming and Yan, Ming and Huang, Fei and Xiao, Jun and others},
  journal={arXiv preprint arXiv:2509.11543},
  year={2025}
}

🤝 Acknowledgements

We sincerely thank the verl and verl-agent projects.
