Shaohang Wei


I am a second-year Ph.D. student in Computer Science at Peking University, advised by Prof. Houfeng Wang. My research interests broadly span language modeling, with a particular focus on continual learning and the self-evolution of LLMs and agents. I strive to understand the learning behaviors of LLMs and to find ways for them to continuously improve and generalize across multiple domains. You can also find me on Twitter/X and REDNote.
I am open to collaborations and discussions, and I am actively seeking research intern opportunities.
Email: [email protected]

Representative Works


TIME: A Multi-level Benchmark for Temporal Reasoning of LLMs in Real-World Scenarios

Shaohang Wei, Wei Li, Feifan Song, Wen Luo, Tianyi Zhuang, Haochen Tan, Zhijiang Guo, Houfeng Wang
NeurIPS 2025 Spotlight (D&B Track)

Sparse but Critical: A Token-Level Analysis of Distributional Shifts in RLVR Fine-Tuning of LLMs

Haoming Meng, Kexin Huang, Shaohang Wei, Chiyu Ma, Shuo Yang, Xue Wang, Guoyin Wang, Bojian Ding, Jingren Zhou
ICLR 2026

Odysseus Navigates the Sirens' Song: Dynamic Focus Decoding for Factual and Diverse Open-Ended Text Generation

Wen Luo, Feifan Song, Wei Li, Guangyue Peng, Shaohang Wei, Houfeng Wang
ACL 2025 Main

Well Begun is Half Done: Low-resource Preference Alignment by Weak-to-Strong Decoding

Feifan Song, Shaohang Wei, Wen Luo, Yuxuan Fan, Tianyu Liu, Guoyin Wang, Houfeng Wang
ACL 2025 Findings

MeNTi: Bridging Medical Calculator and LLM Agent with Nested Tool Calling

Yakun Zhu, Shaohang Wei, Xu Wang, Kui Xue, Shaoting Zhang, Xiaofan Zhang
NAACL 2025 Main

MindScore: Quantifying Human Preference for Text-to-image Generation Through Multi-view Lens

Yiqi Tong*, Jiarui Zhang*, Shaohang Wei* (equal contribution), Wei Guo, Fuzhen Zhuang, Deqing Wang, Xi Yang, Richeng Xuan
Science China Information Sciences (2025)

Publications

  1. TIME: A Multi-level Benchmark for Temporal Reasoning of LLMs in Real-World Scenarios. S. Wei, W. Li, F. Song, W. Luo, T. Zhuang, H. Tan, Z. Guo, H. Wang. NeurIPS 2025 Spotlight (D&B Track).
  2. Sparse but Critical: A Token-Level Analysis of Distributional Shifts in RLVR Fine-Tuning of LLMs. H. Meng, K. Huang, S. Wei, C. Ma, S. Yang, X. Wang, G. Wang, B. Ding, J. Zhou. ICLR 2026.
  3. Odysseus Navigates the Sirens' Song: Dynamic Focus Decoding for Factual and Diverse Open-Ended Text Generation. W. Luo, F. Song, W. Li, G. Peng, S. Wei, H. Wang. ACL 2025 Main.
  4. Well Begun is Half Done: Low-resource Preference Alignment by Weak-to-Strong Decoding. F. Song, S. Wei, W. Luo, Y. Fan, T. Liu, G. Wang, H. Wang. ACL 2025 Findings.
  5. MeNTi: Bridging Medical Calculator and LLM Agent with Nested Tool Calling. Y. Zhu, S. Wei, X. Wang, K. Xue, S. Zhang, X. Zhang. NAACL 2025 Main.
  6. MindScore: Quantifying Human Preference for Text-to-image Generation Through Multi-view Lens. Y. Tong, J. Zhang, S. Wei, W. Guo, F. Zhuang, D. Wang, X. Yang, R. Xuan. SCIS 2025 (CCF-A).

Selected Preprints

  1. SWE-Ext: Extending and Scaling Augmented Data for Repository-Level Coding Tasks. W. Li, X. Zhang, S. Wei, Y. Gao, Z. Guo, W. Luo, F. Song, Y. Huang, H. Wang. paper.
  2. Two Pathways to Truthfulness: On the Intrinsic Encoding of LLM Hallucinations. W. Luo, G. Peng, W. Li, S. Wei, F. Song, L. Wang, N. Yang, X. Zhang, J. Jin, F. Wei, H. Wang. arXiv.
  3. Mitigating Overthinking through Reasoning Shaping. F. Song, S. Wei, B. Gao, Y. Wang, W. Luo, W. Li, L. Yao, W. Xiong, L. Chen, T. Liu, H. Wang. arXiv.
  4. Probability-Entropy Calibration: An Elastic Indicator for Adaptive Fine-tuning. W. Yu, S. Wei, J. Liu, Y. Li, M. Hu, A. Liu, H. Zhang, I. King. arXiv.
  5. CiteCheck: Towards Accurate Citation Faithfulness Detection. Z. Xu, S. Wei, Z. Han, J. Jin, Z. Yang, X. Li, H. Tan, Z. Guo, H. Wang. arXiv.