Academic Homepage

I am currently a principal researcher with the HY LLM Team, where I am responsible for the mid-training stage of large language models. Our work focuses on understanding how synthetic data can be used effectively to scale LLM capabilities.

Previously, I was a contributor to the Qwen 2.5 and Qwen 3 series. Graph-Reasoner is a collection of works on graph problem reasoning; you are welcome to visit it!

Research

We are looking for self-motivated interns with strong hands-on skills who are interested in LLM training, data-centric AI, and large-scale experiments.

We are particularly interested in several open questions:

How can we improve the diversity of synthetic data?
What proportion of synthetic data is optimal?
How can synthetic data scale effectively?
Can synthetic data replace real data?

If you are interested in working on data scaling laws, synthetic data generation, and mid-training strategies for LLMs, feel free to reach out.

News

[12/2025] 🎉Two papers accepted by ICLR 2026.
[09/2025] 🎉One paper accepted by NeurIPS 2025.
[07/2025] 🎉Held a tutorial on role-playing at IJCAI 2025.
[05/2025] 🎉One paper accepted by COLM 2025.
[05/2025] 🎉One paper accepted by KDD 2025 and three papers accepted by ACL 2025.
[02/2025] 🎉Awarded 2024 Baidu Scholarship🎉
[11/2024] 🎉One paper accepted by COLING 2025 (Oral).
[09/2024] 🎉🎉Two papers accepted by EMNLP 2024.
[07/2024] Began interning with the Qwen Foundation Model Team, focusing on pre-training to improve LLM reasoning.
[07/2024] Our thoughts on role-playing are in this survey: The Oscars of AI Theater: A Survey on Role-Playing with Language Models.
[05/2024] Served as a reviewer for EMNLP 2024.
[05/2024] 🎉One paper accepted by KDD 2024.
[02/2024] 🥳🥳 Two recent works are available on arXiv: COMEDY and GraphWiz.
[12/2023] 🥳🥳 Two recent works are available on arXiv: MathOctopus and LLaMA-Probing.
[12/2023] Served as a reviewer for NAACL and ACL 2024.
[10/2023] 🎉🎉🎉 Three papers accepted by EMNLP 2023.
[09/2023] Served as a reviewer for T-PAMI.
[05/2023] 🎉🎉Two papers accepted by ACL 2023.

Selected Publications

Awards

2024

Baidu Scholarship

2023

Most Popular Poster Award at Infohub 2023

2022

Excellent Graduation Thesis of Peking University

2022

Outstanding Graduate of Peking University

2022

Outstanding Graduate of Beijing

2022

AAAI-2022 Scholarship

2021

KDD Cup 2020 Challenges for Modern E-Commerce Platform: Debiasing (Top 4%)

2021

Peking University Tiehan Scholarship

2021

Merit Student of Peking University

Services

Invited talk at Shanghai University of Engineering Science.
Reviewer for ACMMM, ACL, EMNLP, AAAI, NeurIPS, NAACL, ICLR, T-PAMI and COLM.

Education

2022.09 - Now, PhD Candidate, DSA, HKUST.
2019.09 - 2022.06, Master, Science of Computer Engineering, Peking University.
2015.09 - 2019.06, Undergraduate, Beijing University of Posts and Telecommunications.

Miscellaneous

LOL!!! Please feel free to contact me if you have any questions or suggestions: [email protected].