Yifei Wang
I am currently a Member of Technical Staff at the Amazon AGI SF Lab, led by David Luan and Pieter Abbeel. I work with an amazing team pursuing long-term frontier research in large language models, reinforcement learning, and autonomous agents.
Previously, I was a postdoctoral researcher at MIT CSAIL, advised by Stefanie Jegelka. I received my Ph.D. in Applied Mathematics from Peking University, advised by Yisen Wang, Zhouchen Lin, and Jiansheng Yang. I also completed my B.S. and B.A. at Peking University.
My research interests lie broadly in unsupervised learning, representation learning, and reinforcement learning, with the overarching goal of developing more scalable and generalizable learning paradigms. My work has received five best paper awards and has been featured by MIT News and Anthropic. I serve as an Area Chair for ICLR and ICML.
selected publications
- When More is Less: Understanding Chain-of-Thought Length in LLMs. ICLR 2025 Workshop on Reasoning and Planning for LLMs. 🏆 Best Paper Runner-up Award. Cited over 100 times.
- A Theoretical Understanding of Self-Correction through In-context Alignment. In NeurIPS, 2024. 🏆 Best Paper Award at the ICML 2024 ICL Workshop. We proposed the first theoretical explanation of how LLM self-correction works (as in OpenAI o1) and showed its effectiveness against social bias and jailbreak attacks.
- Chaos is a Ladder: A New Theoretical Understanding of Contrastive Learning via Augmentation Overlap. In ICLR, 2022. A new augmentation overlap theory for understanding the generalization of contrastive learning. Cited over 150 times. An extended version was accepted at JMLR.
news
| Date | News |
|---|---|
| December 2025 | Our paper G1 has received the Best Paper Award at the NeurIPS 2025 NPGML Workshop. |
| September 2025 | Four papers were accepted at NeurIPS 2025, including RL-driven graph reasoning with LLMs (G1), signed graph propagation, hierarchical diffusion LMs, and ranking-based LLM decoding. See you in San Diego! |
| August 2025 | I gave an invited talk, Two New Dimensions of Sparsity for Scaling LLMs, to Google DeepMind’s Gemini team, covering our recent work on sparse long-context training (ICLR 2025) and sparse embeddings (ICML 2025 Oral). |
| June 2025 | Our ICML 2025 paper was featured in an MIT News article, Unpacking the bias of large language models, where we identified the root causes of position bias in Transformers and proved them theoretically. |
| June 2025 | I gave an invited talk at the ASAP Seminar on Your Next-Token Prediction and Transformers Are Biased for Long-Context Modeling; see the recording on YouTube. |
| May 2025 | Three papers were accepted to ICML 2025. Our oral presentation (top 1%) introduces contrastive sparse representations (CSR), which compress state-of-the-art embedding models to just 32 active dimensions, enabling ~100× faster retrieval for large-scale vector databases and RAG systems with minimal accuracy loss and low training cost. |