Jian Zhang

Master's student at Xiamen University


Hello, I am Jian (Dylan). Over the past two years, I have collaborated closely with Zhiwen, which helped me develop core research skills and a clear long-term goal: building systems that can perceive, decide, and act in the physical world as humans do. I believe this direction can fundamentally reshape society. I plan to start my PhD at Texas A&M University in Fall 2026 or Spring 2027.

My recent projects include VLM-3R, DynamicVerse, Large Spatial Model, and InstantSplat. Early on, I focused on faster 3D reconstruction and semantic 3D representations, with some exploration of video generation. I am now increasingly focused on intelligence for embodied systems in the physical world.

I am currently seeking internship opportunities. Feel free to contact me by email.

selected publications

  1. VLM-3R: Vision-Language Models Augmented with Instruction-Aligned 3D Reconstruction
    Zhiwen Fan*, Jian Zhang*, Renjie Li, and 8 more authors
    In CVPR, 2026
  2. Large Spatial Model: End-to-end Unposed Images to Semantic 3D
    Zhiwen Fan*, Jian Zhang*, Wenyan Cong, and 8 more authors
    In NeurIPS, 2024