Cyril Zhang
Hi! I'm a researcher at OpenAI. In an academic capacity, I've worked on the theory and science of deep learning and its adjacent fields. I'm currently doing the same at frontier scales.
Some cherished research topics (and haunting questions):
- The "microstructure" of deep learning. How do the right (and wrong) data-adapted circuits emerge, mechanistically? Which "toy" problems provide exceptional clarity on the weird and wonderful phenomena involved? Which mathematical tools and solution concepts from {statistics, optimization, theoretical computer science} reveal good design spaces for scalable representation learning?
- Temporality, dynamical systems, and control. How do we make sense of life's {discrete, continuous} sequences? Towards doing so efficiently and reliably, how do we surmount the perennial challenges of {non-convexity, non-stationarity, misspecification}?
I was previously at Microsoft Research NYC, a wondrous catalyst for unclassifiable science and creative thought. Before that, I studied computer science at Princeton (Ph.D. '20, advised by Prof. Elad Hazan) and Yale (B.S. '15), and grew up in Toronto.
Publications
With Edoardo Botta, Yuchen Li, Aashay Mehta, Jordan Ash, and Andrej Risteski.
Phi-4 Technical Report. Preprint.
With the Microsoft Phi team.
Self-Improvement in Language Models: The Sharpening Mechanism. ICLR 2025 (oral).
With Audrey Huang, Adam Block, Dylan Foster, Dhruv Rohatgi, Max Simchowitz, Jordan Ash, and Akshay Krishnamurthy.
With the Microsoft Phi team.
Can Large Language Models Explore In-Context? NeurIPS 2024.
With Akshay Krishnamurthy, Keegan Harris, Dylan Foster, and Alex Slivkins.
Butterfly Effects of SGD Noise: Error Amplification in Behavior Cloning and Autoregression. ICLR 2024.
With Adam Block, Dylan Foster, Akshay Krishnamurthy, and Max Simchowitz.
With Akanksha Saran, Jacob Alber, Danielle Bragg, and John Langford.
Pareto Frontiers in Neural Feature Learning: Data, Compute, Width, and Luck. NeurIPS 2023 (spotlight).
With Benjamin L. Edelman, Surbhi Goel, Sham Kakade, and Eran Malach.
Exposing Attention Glitches with Flip-Flop Language Modeling. NeurIPS 2023 (spotlight).
With Bingbin Liu, Jordan T. Ash, Surbhi Goel, and Akshay Krishnamurthy.
With Sham Kakade, Akshay Krishnamurthy, and Gaurav Mahajan.
Transformers Learn Shortcuts to Automata. ICLR 2023 (oral).
With Bingbin Liu, Jordan T. Ash, Surbhi Goel, and Akshay Krishnamurthy.
With Surbhi Goel, Sham Kakade, and Adam Kalai.
With Boaz Barak, Benjamin L. Edelman, Surbhi Goel, Sham Kakade, and Eran Malach.
With Benjamin L. Edelman, Surbhi Goel, and Sham Kakade.
With Nikunj Saunshi, Jordan T. Ash, Surbhi Goel, Dipendra Misra, Sanjeev Arora, Sham Kakade, and Akshay Krishnamurthy.
With Yonathan Efroni, Sham Kakade, and Akshay Krishnamurthy.
With Jordan T. Ash, Surbhi Goel, Akshay Krishnamurthy, and Sham Kakade.
With Naman Agarwal and Surbhi Goel.
Machine Learning for Mechanical Ventilation Control. Extended abstract @ ML4H 2021.
With Daniel Suo, Naman Agarwal, Wenhan Xia, Xinyi Chen, Udaya Ghai, Alex Yu, Paula Gradu, Karan Singh, Edgar Minasyan, Julienne LaChance, Tom Zajdel, Manuel Schottdorf, Daniel Cohen, and Elad Hazan.
Deluca—A Differentiable Control Library: Environments, Methods, and Benchmarking. DiffCVGP workshop @ NeurIPS 2020.
With Paula Gradu, John Hallman, Daniel Suo, Alex Yu, Naman Agarwal, Udaya Ghai, Karan Singh, Anirudha Majumdar, and Elad Hazan.
Stochastic Optimization with Laggard Data Pipelines. NeurIPS 2020.
With Naman Agarwal, Rohan Anil, Tomer Koren, and Kunal Talwar.
With Udaya Ghai, Holden Lee, Karan Singh, and Yi Zhang.
With Mark Braverman, Xinyi Chen, Sham Kakade, Karthik Narasimhan, and Yi Zhang.
With Holden Lee.
With Xinyi Chen, Naman Agarwal, Elad Hazan, and Yi Zhang.
With Naman Agarwal, Rohan Anil, Elad Hazan, and Tomer Koren.
With Naman Agarwal, Brian Bullins, Xinyi Chen, Elad Hazan, Karan Singh, and Yi Zhang.
Spectral Filtering for General Linear Dynamical Systems. NeurIPS 2018 (oral).
With Elad Hazan, Holden Lee, Karan Singh, and Yi Zhang.
Not-So-Random Features. ICLR 2018.
With Brian Bullins and Yi Zhang.
Towards Provable Control in Unknown Linear Dynamical Systems. Workshop @ ICLR 2018.
With Sanjeev Arora, Elad Hazan, Holden Lee, Karan Singh, and Yi Zhang.
Learning Linear Dynamical Systems via Spectral Filtering. NeurIPS 2017 (spotlight).
With Elad Hazan and Karan Singh.
With Elad Hazan and Karan Singh.
With Matthew Giguere, Debra Fischer, Jaymie Matthews, Chris Cameron, and Gregory Henry.