Hello! I'm Zhenting Qi

I am a second-year Master's student in Computational Science and Engineering at Harvard University. Prior to Harvard, I received a dual bachelor's degree in Computer Engineering from UIUC and ZJU with highest honors.

I am currently a member of the AI4LIFE group at Harvard, advised by Prof. Hima Lakkaraju. Previously, I was a student researcher at Yale (2022-2023), advised by Prof. Dragomir R. Radev, and at UIUC (2022), advised by Prof. Volodymyr Kindratenko. I also interned at Microsoft Research Asia (2023-2024), advised by Dr. Li Lyna Zhang.

I am open to research or internship opportunities! Please contact me at zhentingqi[AT]g[DOT]harvard[DOT]edu.


Research

My research interests lie broadly in Language Modeling and Generative AI. My long-term research goal is to build intelligent and reliable AI systems for the benefit of human society. Motivated by this goal, I am currently interested in the following topics (in no particular order):

1) Reasoning: Why do current reasoning paradigms work, and how do we improve them? How can we design AI systems that perform robust reasoning and generalize well to out-of-distribution settings?

2) Reliability: How do we better understand AI systems and further enhance their controllability and robustness? How can we design scalable methods to ensure their safety while improving their capabilities?

For more information, please see my Google Scholar, Semantic Scholar, and DBLP profiles.

Publications (Selected)

Quantifying Generalization Complexity for Large Language Models

Preprint.

We introduce Scylla, a dynamic evaluation framework that quantitatively measures the generalization abilities of LLMs. Using Scylla, we uncover a non-monotonic relationship between task complexity and the performance gap between in-distribution (ID) and out-of-distribution (OOD) data, which we term the "generalization valley". This phenomenon reveals a "critical complexity" at which reliance on non-generalizable behavior peaks, indicating the upper bound of LLMs' generalization capabilities.

Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers

Preprint.

We introduce rStar, a self-play mutual reasoning approach that significantly improves the reasoning capabilities of small language models (SLMs) without fine-tuning or reliance on superior models.

FOLIO: Natural Language Reasoning with First-Order Logic

EMNLP 2024

We present FOLIO, a human-annotated, open-domain, logically complex and diverse dataset for reasoning in natural language (NL), equipped with first-order logic (FOL) annotations.

Constrained Human-AI Cooperation: An Inclusive Embodied Social Intelligence Challenge

NeurIPS 2024

We introduce Constrained Human-AI Cooperation (CHAIC), an inclusive embodied social intelligence challenge that tests social perception and cooperation in embodied agents.

Follow My Instruction and Spill the Beans: Scalable Data Extraction from Retrieval-Augmented Generation Systems

ICLR 2024 Workshop on Navigating and Addressing Data Problems for Foundation Models (DPFM)

We show that an adversary can exploit LMs' instruction-following capabilities to easily extract text data verbatim from the datastore of RAG systems built with instruction-tuned LMs via prompt injection.

PILLOW: Enhancing Efficient Instruction Fine-tuning via Prompt Matching

EMNLP 2023, Industry Track Oral Presentation

We improve LoRA-fine-tuned LLMs with a prompt-matching framework, reaching performance on par with full SFT.

QTSumm: A New Benchmark for Query-Focused Table Summarization

EMNLP 2023

We introduce QTSumm, a new benchmark for query-focused table summarization that contains 7,111 human-annotated query-summary pairs over 2,934 tables covering diverse topics.

SaFER: A Robust and Efficient Framework for Finetuning BERT-based Classifier with Noisy Labels

ACL 2023, Industry Track

We propose a robust and efficient fine-tuning framework for BERT-based text classifiers that combats label noise without access to any clean data for training or validation.

LoFT: Enhancing Faithfulness and Diversity for Table-to-Text Generation via Logic Form Control

EACL 2023, Short Paper Oral Presentation

LoFT utilizes logic forms as fact verifiers and content planners to control logical table-to-text generation.


Industry

Research Assistant in LLMs, MIT-IBM Watson AI Lab (Cambridge, MA, U.S.)

Aug 2024 - Present

Researching LLM pre-training and post-training.

Research Assistant in LLMs, Microsoft Research Asia (Beijing, China)

Oct 2023 - Jun 2024

Researched LLM reasoning.

Research Intern in AI Algorithms, INF Technology (Shanghai, China)

Feb 2022 - Jun 2023

Researched LLMs for industrial applications.


Miscellaneous

I enjoy playing basketball, reading, cooking, and watching movies, and I am a huge fan of guitar and music!