|
Xiaofang Wang
I am a Staff Research Scientist at Meta Superintelligence Labs (MSL), working on multimodal post-training and reasoning. I received my Ph.D. from the Robotics Institute at Carnegie Mellon University and my B.S. in Computer Science from Peking University.
E-mail / Google Scholar / LinkedIn
|
|
|
Experience
|
- 02/2023 - current: Research Scientist @ Meta Superintelligence Labs
- 06/2022 - 02/2023: Research Scientist @ Mobile Vision, Meta Reality Labs
- 05/2020 - 08/2020: Research Intern @ Google Perception
- 05/2019 - 08/2019: Research Intern @ Google Cloud AI
|
|
Education
|
- 08/2017 - 05/2022: Ph.D. in Robotics, Carnegie Mellon University (advisor: Kris Kitani)
- 08/2015 - 05/2017: M.S. in Robotics, Carnegie Mellon University (advisors: Kris Kitani, Martial Hebert)
- 08/2011 - 07/2015: B.S. in Computer Science, Peking University
|
|
Building a Mind Palace: Structuring Environment-Grounded Semantic Graphs for Effective Long Video Analysis with LLMs
Zeyi Huang, Yuyang Ji, Xiaofang Wang, Nikhil Mehta, Tong Xiao, Donghyun Lee, Sigmund Vanvalkenburgh, Shengxin Zha, Bolin Lai, Licheng Yu, Ning Zhang, Yong Jae Lee, Miao Liu
Computer Vision and Pattern Recognition Conference (CVPR), 2025
|
|
Apollo: An Exploration of Video Understanding in Large Multimodal Models
Orr Zohar, Xiaohan Wang, Yann Dubois, Nikhil Mehta, Tong Xiao, Philippe Hansen-Estruch, Licheng Yu, Xiaofang Wang, Felix Juefei-Xu, Ning Zhang, Serena Yeung-Levy, Xide Xia
Computer Vision and Pattern Recognition Conference (CVPR), 2025
|
|
Accelerating Multimodal Large Language Models by Searching Optimal Vision Token Reduction
Shiyu Zhao, Zhenting Wang, Felix Juefei-Xu, Xide Xia, Miao Liu, Xiaofang Wang, Mingfu Liang, Ning Zhang, Dimitris N Metaxas, Licheng Yu
Computer Vision and Pattern Recognition Conference (CVPR), 2025
|
|
ControlRoom3D: Room Generation using Semantic Proxy Rooms
Jonas Schult, Sam Tsai, Lukas Höllein, Bichen Wu, Jialiang Wang, Chih-Yao Ma, Kunpeng Li, Xiaofang Wang, Felix Wimbauer, Zijian He, Peizhao Zhang, Bastian Leibe, Peter Vajda, Ji Hou
Computer Vision and Pattern Recognition Conference (CVPR), 2024
|
|
Cost-Aware Evaluation and Model Scaling for LiDAR-Based 3D Object Detection
Xiaofang Wang, Kris M. Kitani
International Conference on Robotics and Automation (ICRA), 2023
|
|
Wisdom of Committees: An Overlooked Approach To Faster and More Accurate Models
Xiaofang Wang, Dan Kondratyuk, Eric Christiansen, Kris M. Kitani, Yair Alon, Elad Eban
International Conference on Learning Representations (ICLR), 2022
[Poster]
[Google AI Blog]
|
|
Neighborhood-Aware Neural Architecture Search
Xiaofang Wang, Shengcao Cao, Mengtian Li, Kris M. Kitani
British Machine Vision Conference (BMVC), 2021
|
|
AttentionNAS: Spatiotemporal Attention Cell Search for Video Classification
Xiaofang Wang, Xuehan Xiong, Maxim Neumann, AJ Piergiovanni, Michael S. Ryoo, Anelia Angelova, Kris M. Kitani, Wei Hua
European Conference on Computer Vision (ECCV), 2020
[Video-1 minute]
[Video]
[Slides]
|
|
Learnable Embedding Space for Efficient Neural Architecture Compression
Shengcao Cao*, Xiaofang Wang*, Kris M. Kitani
International Conference on Learning Representations (ICLR), 2019
* indicates equal contribution.
[Code]
[Poster]
[Architecture Visualization]
|
|
Error Correction Maximization for Deep Image Hashing
Xiang Xu, Xiaofang Wang, Kris M. Kitani
British Machine Vision Conference (BMVC), 2018
|
|
Deep Supervised Hashing with Triplet Labels
Xiaofang Wang, Yi Shi, Kris M. Kitani
Asian Conference on Computer Vision (ACCV), 2016
Oral Presentation (5.6% acceptance rate)
[Code]
|
|
Hamming Compatible Quantization for Hashing
Zhe Wang, Ling-Yu Duan, Jie Lin, Xiaofang Wang, Tiejun Huang, Wen Gao
International Joint Conference on Artificial Intelligence (IJCAI), 2015
|
|