Siyuan Qian
Ph.D. Student @ Peking University · Embodied AI (Vision-Language-Action Model, World Model)
I am currently a Ph.D. student at the National Engineering Research Center of Visual Technology (NERCVT), School of Computer Science, Peking University, supervised by Prof. Shanghang Zhang. I received my B.Eng. in Electronic Information Engineering from Beihang University (BUAA).
My research focuses on Embodied AI, exploring the modeling, reasoning, and generalization capabilities of Vision-Language-Action (VLA) models and world models in real-world robotic scenarios such as mobile manipulation. I have published papers at top-tier venues including NeurIPS, RSS, and Nature Computational Science, and I serve as a reviewer for conferences such as NeurIPS, ICML, and ECCV.
News
| Date | News |
|---|---|
| Oct 10, 2025 | Our paper on Implicit Neural Image Field for Biological Microscopy Image Compression has been published in Nature Computational Science! |
| Sep 18, 2025 | Our paper AC-DiT: Adaptive Coordination Diffusion Transformer for Mobile Manipulation has been accepted to NeurIPS 2025! |
| Sep 01, 2025 | Joined Simplexity Robotics as a research intern, working on VLA deployment for real mobile manipulation scenarios. |
| May 07, 2025 | Our paper RoboMIND: Benchmark on Multi-embodiment Intelligence Normative Data for Robot Manipulation has been accepted to RSS 2025! |
| Aug 27, 2024 | Started research internship at BAAI (Beijing Academy of Artificial Intelligence), Embodied Multimodal LLM Research Center. |