Siyuan Qian

Ph.D. Student @ Peking University · Embodied AI (Vision-Language-Action Model, World Model)

I am a Ph.D. student at the National Engineering Research Center of Visual Technology (NERCVT), School of Computer Science, Peking University, supervised by Prof. Shanghang Zhang. I received my B.Eng. in Electronic Information Engineering from Beihang University (BUAA).

My research focuses on Embodied AI, exploring the modeling, reasoning, and generalization capabilities of Vision-Language-Action (VLA) models and world models in real-world robotic scenarios such as mobile manipulation. I have published papers at top-tier venues including NeurIPS, RSS, and Nature Computational Science, and I serve as a reviewer for conferences such as NeurIPS, ICML, and ECCV.

News

Oct 10, 2025 Our paper Implicit Neural Image Field for Biological Microscopy Image Compression has been published in Nature Computational Science!
Sep 18, 2025 Our paper AC-DiT: Adaptive Coordination Diffusion Transformer for Mobile Manipulation has been accepted to NeurIPS 2025!
Sep 01, 2025 Joined Simplexity Robotics as a research intern, working on VLA deployment for real mobile manipulation scenarios.
May 07, 2025 Our paper RoboMIND: Benchmark on Multi-embodiment Intelligence Normative Data for Robot Manipulation has been accepted to RSS 2025!
Aug 27, 2024 Started a research internship at BAAI (Beijing Academy of Artificial Intelligence), Embodied Multimodal LLM Research Center.

Publications

2025

  1. NeurIPS 2025
    AC-DiT: Adaptive Coordination Diffusion Transformer for Mobile Manipulation
    Sixiang Chen*, Jiaming Liu*, Siyuan Qian*, Han Jiang, Lily Li, Renrui Zhang, Zhuoyang Liu, Chenyang Gu, Chengkai Hou, Pengwei Wang, Zhongyuan Wang, and Shanghang Zhang
    Neural Information Processing Systems (NeurIPS), 2025
  2. RSS 2025
    RoboMIND: Benchmark on Multi-embodiment Intelligence Normative Data for Robot Manipulation
    Kun Wu*, Chengkai Hou*, Jiaming Liu*, Zhengping Che*, Xiaozhu Ju*, ..., Siyuan Qian, Shanghang Zhang, and Jian Tang
    Robotics: Science and Systems (RSS), 2025
  3. Nat. Comput. Sci.
    Implicit Neural Image Field for Biological Microscopy Image Compression
    Gaole Dai, ..., Siyuan Qian, Ming Lu, Ali Ata Tuz, and Matthias Gunzer
    Nature Computational Science, 2025

2023

  1. arXiv
    Chain of Thought Prompt Tuning in Vision Language Models
    Jiaxin Ge, Hongyin Luo, Siyuan Qian, Yulu Gan, Jie Fu, and Shanghang Zhang
    arXiv preprint arXiv:2304.07919, 2023