Mobile-Agent-v3

📢News

  • 🔥[9.16] We've open-sourced the code of GUI-Owl and Mobile-Agent-v3 on the OSWorld benchmark.
  • 🔥[9.10] We've open-sourced the code for the AndroidWorld benchmark and real-world mobile scenarios for GUI-Owl and Mobile-Agent-v3.
  • [8.10] We released GUI-Owl-7B and GUI-Owl-32B.
  • [8.10] The technical report is available on arXiv (arXiv:2508.15144).

📍 TODO

  • Open source code of Mobile-Agent-v3 on real-world mobile scenarios
  • Open source evaluation code for GUI-Owl and Mobile-Agent-v3 on AndroidWorld
  • Open source code of Mobile-Agent-v3 on real-world PC scenarios
  • Open source evaluation code on OSWorld

Introduction

GUI-Owl is a model series developed as part of the Mobile-Agent-v3 project. It achieves state-of-the-art performance across a range of GUI automation benchmarks, including ScreenSpot-v2, ScreenSpot-Pro, OSWorld-G, MMBench-GUI, Android Control, Android World, and OSWorld. Furthermore, it can be instantiated as various specialized agents within the Mobile-Agent-v3 multi-agent framework to accomplish more complex tasks.

Deploy Mobile-Agent-v3 on your mobile device.

❗At present, Mobile-Agent supports only Android and HarmonyOS, which provide debugging tools (ADB and HDC). Other systems, such as iOS, are not supported for the time being.

Install the dependencies required by the Qwen model.

pip install qwen_agent
pip install qwen_vl_utils
pip install numpy

Preparation for Connecting Mobile Device with ADB

  1. Download the Android Debug Bridge.
  2. Enable USB debugging on your Android phone; Developer options must be enabled first. On HyperOS, also enable USB Debugging (Security Settings).
  3. Connect your phone to the computer with a data cable and select "Transfer files".
  4. Test your ADB environment as follows: /path/to/adb devices. If the connected devices are listed, the preparation is complete.
  5. If you are using macOS or Linux, make sure adb is executable: sudo chmod +x /path/to/adb
  6. If you are using Windows, the path will be xx/xx/adb.exe
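
The check in step 4 can also be scripted. The following is a minimal sketch, assuming `adb devices` prints its usual "List of devices attached" header followed by one whitespace-separated `serial state` line per device. The helper names `parse_adb_devices` and `list_devices` are hypothetical and are not part of Mobile-Agent-v3.

```python
import subprocess


def parse_adb_devices(output: str) -> list[tuple[str, str]]:
    """Parse the text printed by `adb devices` into (serial, state) pairs."""
    devices = []
    for line in output.splitlines():
        line = line.strip()
        # Skip blank lines, the "List of devices attached" header,
        # and daemon start-up messages, which begin with "*".
        if not line or line.startswith("List of devices") or line.startswith("*"):
            continue
        parts = line.split()
        if len(parts) >= 2:
            devices.append((parts[0], parts[1]))
    return devices


def list_devices(adb_path: str) -> list[tuple[str, str]]:
    """Run `<adb_path> devices` and return the parsed device list."""
    result = subprocess.run(
        [adb_path, "devices"], capture_output=True, text=True, check=True
    )
    return parse_adb_devices(result.stdout)
```

A state of `device` means the phone is ready. A state of `unauthorized` usually means the debugging prompt on the phone has not been accepted yet.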

Install the ADB Keyboard on your Mobile Device

  1. Download the ADB keyboard apk installation package.
  2. Click the apk to install on your mobile device.
  3. Switch the default input method in the system settings to "ADB Keyboard".
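
Step 3 can alternatively be done over ADB instead of through the settings menu. The sketch below assumes the installed keyboard is the open-source ADBKeyBoard app, whose input method ID is `com.android.adbkeyboard/.AdbIME`; if your APK uses a different ID, adjust the constant. The function names are hypothetical.

```python
import subprocess

# IME identifier used by the open-source ADBKeyBoard app (an assumption here).
ADB_KEYBOARD_IME = "com.android.adbkeyboard/.AdbIME"


def has_adb_keyboard(ime_list_output: str) -> bool:
    """Return True if the IME id appears in `ime list -a -s` output."""
    return ADB_KEYBOARD_IME in (
        line.strip() for line in ime_list_output.splitlines()
    )


def activate_adb_keyboard(adb_path: str) -> None:
    """Enable ADB Keyboard and make it the current input method."""
    listed = subprocess.run(
        [adb_path, "shell", "ime", "list", "-a", "-s"],
        capture_output=True, text=True, check=True,
    ).stdout
    if not has_adb_keyboard(listed):
        raise RuntimeError("ADB Keyboard is not installed on the device")
    # `ime enable` registers the keyboard; `ime set` switches to it.
    for action in ("enable", "set"):
        subprocess.run(
            [adb_path, "shell", "ime", action, ADB_KEYBOARD_IME], check=True
        )
```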

Run

Android

cd Mobile-Agent-v3/mobile_v3
python run_mobileagentv3.py \
    --adb_path "Your ADB path" \
    --api_key "Your api key of vllm service" \
    --base_url "Your base url of vllm service" \
    --model "Your model name of vllm service" \
    --instruction "The instruction you want Mobile-Agent-v3 to complete" \
    --add_info "Some supplementary knowledge, can also be empty"

HarmonyOS

cd Mobile-Agent-v3/mobile_v3
python run_mobileagentv3.py \
    --hdc_path "Your HDC path" \
    --api_key "Your api key of vllm service" \
    --base_url "Your base url of vllm service" \
    --model "Your model name of vllm service" \
    --instruction "The instruction you want Mobile-Agent-v3 to complete" \
    --add_info "Some supplementary knowledge, can also be empty"
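
The Android and HarmonyOS commands differ only in the tool-path flag (`--adb_path` versus `--hdc_path`). If you launch the agent from another Python program, a small wrapper can make that choice explicit. This is a sketch only: `build_run_command` is a hypothetical helper, and it uses just the flags shown in the commands above.

```python
import sys


def build_run_command(platform, tool_path, api_key, base_url, model,
                      instruction, add_info=""):
    """Build the argument list for run_mobileagentv3.py."""
    # The flag names below are the ones shown in the commands above.
    tool_flags = {"android": "--adb_path", "harmonyos": "--hdc_path"}
    if platform not in tool_flags:
        raise ValueError(f"unsupported platform: {platform!r}")
    return [
        sys.executable, "run_mobileagentv3.py",
        tool_flags[platform], tool_path,
        "--api_key", api_key,
        "--base_url", base_url,
        "--model", model,
        "--instruction", instruction,
        "--add_info", add_info,
    ]
```

The returned list can be passed to `subprocess.run(cmd, cwd="Mobile-Agent-v3/mobile_v3", check=True)`. Passing arguments as a list avoids shell quoting problems when the instruction contains spaces or quotes.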

Note

  1. If the model you are using outputs relative coordinates from 0 to 1000, such as Seed-VL, Qwen-VL-2, or Qwen-VL-3, please set:
--coor_type "qwen-vl" # Maps coordinates in the 0-1000 range to the actual device resolution.

     If the model you are using outputs absolute coordinates, such as Qwen-VL-2.5 or GUI-Owl, please do not set coordinate mapping.

  2. If your instruction needs to remember content, please set:
--notetaker True
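
To illustrate the coordinate mapping described in the first note, here is a minimal sketch of converting a model's 0-1000 relative coordinates to device pixels. It is an illustration under stated assumptions, not the project's actual implementation: the function name `relative_to_absolute`, the rounding, and the clamping to the last pixel are all choices made for this example.

```python
def relative_to_absolute(x_rel, y_rel, width, height, scale=1000):
    """Map (x_rel, y_rel) in [0, scale] to pixel coordinates on the screen."""
    x = round(x_rel * width / scale)
    y = round(y_rel * height / scale)
    # Clamp so that a value of exactly `scale` still lands on the last pixel.
    x = min(max(x, 0), width - 1)
    y = min(max(y, 0), height - 1)
    return x, y
```

For example, on a 1080x2400 screen, `relative_to_absolute(500, 500, 1080, 2400)` returns `(540, 1200)`, the center of the screen. Models that already output absolute pixel coordinates need no such step, which is why the flag should be left unset for them.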

Evaluation on OSWorld

  1. Please follow the official code repository to install OSWorld and the necessary dependencies.

  2. Fill in your vllm service information in the run_guiowl.sh or run_ma3.sh script, including api_key, base_url, and model.

  3. Run the evaluation.

cd MobileAgent/Mobile-Agent-v3/os_world_v3
sh run_guiowl.sh
sh run_ma3.sh

Evaluation on AndroidWorld

  1. Please follow the official code repository to install the Android emulator and necessary dependencies.

  2. Install the dependencies required by the Qwen model.

pip install qwen_agent
pip install qwen_vl_utils

  3. Fill in your vllm service information in the run_guiowl.sh or run_ma3.sh script, including api_key, base_url, and model.

  4. Run the evaluation.

cd MobileAgent/Mobile-Agent-v3/android_world_v3
sh run_guiowl.sh
sh run_ma3.sh

  5. We provide evaluation trajectories and logs for review.

Performance

ScreenSpot-v2, ScreenSpot-Pro and OSWorld-G

MMBench-GUI L1, L2 and Android Control

Android World and OSWorld-Verified

Usage

Please refer to our cookbook.

Deploy

Please refer to the README of the model card on HuggingFace for optimized performance.

Citation

If you find our paper and model useful in your research, please consider citing us.

@article{ye2025mobile,
  title={Mobile-Agent-v3: Fundamental Agents for GUI Automation},
  author={Ye, Jiabo and Zhang, Xi and Xu, Haiyang and Liu, Haowei and Wang, Junyang and Zhu, Zhaoqing and Zheng, Ziwei and Gao, Feiyu and Cao, Junjie and Lu, Zhengxi and others},
  journal={arXiv preprint arXiv:2508.15144},
  year={2025}
}