👋About Me

I am a first-year Statistics and Data Science Ph.D. student at UCLA, advised by Prof. Kai-Wei Chang and Prof. Ying Nian Wu. Before that, I earned my M.S. in Data Science from Tsinghua University and my B.S. in Electronic Engineering from Sun Yat-sen University (SYSU). Currently, I am a research intern at Microsoft Research, working with Dr. Yeyun Gong, Dr. Yelong Shen, and Dr. Weizhu Chen. My research interests lie at the intersection of generative models, large language models (LLMs), and reinforcement learning (RL).

📖Education

  • Sep. 2025 - Jun. 2030 (Expected) Ph.D., Statistics and Data Science, University of California, Los Angeles, USA.

  • Aug. 2022 - Jun. 2025 M.Sc., Data Science and Information Technology, Tsinghua University, Beijing, China.
    GPA: 3.98/4.0, Top 3%

  • Sep. 2018 - Jun. 2022 B.Sc., Electronic Information Science and Technology, Sun Yat-sen University, Guangzhou, China.
    GPA: 4.11/5.0, Top 3%

📑Selected Publications

SwS: Self-aware Weakness-driven Problem Synthesis in Reinforcement Learning for LLM Reasoning
Xiao Liang*, Zhong-Zhi Li*, Yeyun Gong, Yang Wang, Hengyuan Zhang, Yelong Shen, Ying Nian Wu, Weizhu Chen
Preprint 2025, [Paper] [Code]

We introduce a Self-aware Weakness-driven problem Synthesis framework that identifies and leverages model weaknesses for problem augmentation in reinforcement learning with verifiable rewards (RLVR).

TL;DR: Too Long, Do Re-weighting for Efficient LLM Reasoning Compression
Zhong-Zhi Li*, Xiao Liang*, Zihao Tang, Lei Ji, Peijie Wang, Haotian Xu, Xing W, Haizhen Huang, Weiwei Deng, Yeyun Gong, Ying Nian Wu, Zhijiang Guo, Xiao Liu, Fei Yin, Cheng-Lin Liu
Preprint 2025, [Paper] [Code]

We propose a dynamic ratio-based training pipeline that balances the model’s System-1 and System-2 data to reduce redundant reasoning.

Reinforcement Learning with Verifiable Rewards Implicitly Incentivizes Correct Reasoning in Base LLMs
Xumeng Wen*, Zihan Liu*, Shun Zheng*, Zhijian Xu, Shengyu Ye, Zhirong Wu, Xiao Liang*, Yang Wang, Junjie Li, Ziming Miao, Jiang Bian, Mao Yang
Preprint 2025, [Paper]

We introduce a stricter reasoning-aware metric than pass@k and a supporting theoretical foundation to show that RLVR uniquely incentivizes logically consistent reasoning and generalizes across tasks.

Integrative Decoding: Improve Factuality via Implicit Self-consistency
Yi Cheng, Xiao Liang, Yeyun Gong, Wen Xiao, Song Wang, Yuji Zhang, Wenjun Hou, Kaishuai Xu, Wenge Liu, Wenjie Li, Jian Jiao, Qi Chen, Peng Cheng, Wayne Xiong
ICLR 2025, [Paper]

This paper presents a self-consistency based decoding strategy for improving the factual accuracy of large language models, especially in long-form generation tasks.

Task Oriented In-Domain Data Augmentation
Xiao Liang*, Xinyu Hu*, Simiao Zuo, Yeyun Gong, Qiang Lou, Yi Liu, Shao-Lun Huang, Jian Jiao
EMNLP 2024, [Paper]

We propose a task-oriented in-domain data augmentation framework consisting of in-domain data selection and task-oriented synthetic passage generation.

Chunk, Align, Select: A Simple Long-sequence Processing Method for Transformers
Jiawen Xie, Pengyu Cheng, Xiao Liang, Yong Dai, Nan Du
ACL 2024, [Paper] [Code]

We propose a reinforcement-learning-based token selection framework that enables pre-trained transformers to process long sequences.

Chain-of-Reasoning: Towards Unified Mathematical Reasoning in Large Language Models via a Multi-Paradigm Perspective
Yiyao Yu, Yuxiang Zhang, Dongdong Zhang, Xiao Liang, Hengyuan Zhang, Xingxing Zhang, Ziyi Yang, Mahmoud Khademi, Hany Awadalla, Junjie Wang, Yujiu Yang, Furu Wei
ACL 2025, [Paper]

We introduce a framework that integrates Natural Language Reasoning, Algorithmic Reasoning, and Symbolic Reasoning to enable synergistic collaboration for LLMs.

(* indicates equal contribution)

🧑‍💻Experience

  • (Nov. 2023 - Present) Research Intern, NLC Group, Microsoft Research Asia, Beijing, China.
    Mentor: Yeyun Gong, Weizhu Chen
    Working on large language models, with a focus on pre-training strategies and model architectures.

  • (Mar. 2023 - Sep. 2023) Research Intern, AI Lab, Tencent Inc., Guangdong, China.
    Mentor: Pengyu Cheng, Nan Du
    Working on large language models, with a focus on long-sequence processing.

🏆Honors and Awards

  • Outstanding Master Graduate Thesis, Tsinghua University, Beijing, 2025
  • Second Prize Scholarship, Tsinghua University, Beijing, 2024
  • Outstanding Graduate Thesis, Sun Yat-sen University, Guangdong, 2022
  • Outstanding Graduate Student, Sun Yat-sen University, Guangdong, 2022
  • Second Prize Scholarship, Sun Yat-sen University, Guangdong, 2019-2022
  • First Place, Multimodal Learning Track, 2022 Tsinghua Open Hack Competition, Tsinghua University, Beijing, 2022