I completed my Ph.D. in 2019 in the Machine Learning Department at Carnegie Mellon University, where I was fortunate to be co-advised by Maria-Florina Balcan and David P. Woodruff. Before joining Waterloo, I was a postdoctoral fellow at the Toyota Technological Institute at Chicago (TTIC), hosted by Avrim Blum and Greg Shakhnarovich. I graduated from Peking University in 2015. I have also spent time at the Simons Institute and IPAM.
My research interests span machine learning, inference acceleration, AI security, and world models. Current research includes:
Inference Acceleration: Designing algorithms that speed up AI inference and drastically lower the deployment costs of foundation models, making them more accessible and efficient. [e.g., EAGLE: the fastest-known speculative sampling method]
System-2 LLMs: Creating general algorithms that leverage test-time compute to boost the reasoning and alignment of LLMs. [e.g., RAIN: LLM alignment without finetuning]
AI Security: Building foundations for defenses against adversarial, out-of-distribution, and privacy attacks. Developing watermarking techniques and prioritizing copyright and privacy protection in LLMs. [e.g., TRADES: a state-of-the-art adversarial training methodology that has stood a five-year test of time]
World Models and Agents: Developing world models that can imagine and generate data to train agents for real-world applications such as robotics and self-driving cars. We follow the path AAA Games -> World Models -> Real World. [e.g., The Matrix: an infinite-length, hyper-realistic world model with real-time, frame-level control]
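For reference, the TRADES objective mentioned above makes the robustness-accuracy trade-off explicit: a natural-accuracy loss plus a regularizer, weighted by β, that pulls the prediction at an input X toward the prediction at its worst-case neighbor X'. In its commonly used practical form:

```latex
\min_{f} \; \mathbb{E}_{(X,Y)} \Big[ \mathcal{L}\big(f(X), Y\big)
  + \beta \max_{X' \in \mathbb{B}(X, \epsilon)} \mathcal{L}\big(f(X), f(X')\big) \Big]
```

Here the first loss is typically cross-entropy against the label, the second is typically a KL divergence between the two predictions, and β (written 1/λ in the paper) controls how much natural accuracy is traded for robustness.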
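The core idea behind speculative sampling (the general draft-then-verify technique that EAGLE builds on, not EAGLE's own feature-level drafting) is that a cheap draft model proposes several tokens and the expensive target model verifies them in parallel, accepting or resampling so that the output distribution is provably identical to the target's. A minimal sketch, with toy hand-written distributions standing in for the draft and target LLMs:

```python
import random

# Toy "models": each maps a context (tuple of ints) to a next-token distribution.
# In real speculative sampling these would be a small draft LLM and a large target LLM.
def draft_model(ctx):
    return {(ctx[-1] + 1) % 10: 0.9, (ctx[-1] + 2) % 10: 0.1}

def target_model(ctx):
    # Similar but not identical to the draft, so most (not all) proposals are accepted.
    return {(ctx[-1] + 1) % 10: 0.8, (ctx[-1] + 3) % 10: 0.2}

def sample(dist, rng):
    r, acc = rng.random(), 0.0
    for tok, p in dist.items():
        acc += p
        if r < acc:
            return tok
    return tok  # guard against floating-point round-off

def speculative_step(ctx, k, rng):
    """One draft-then-verify round; lossless w.r.t. the target distribution."""
    # 1) Draft k tokens autoregressively with the cheap model.
    draft, c = [], list(ctx)
    for _ in range(k):
        t = sample(draft_model(tuple(c)), rng)
        draft.append(t)
        c.append(t)
    # 2) Verify each drafted token: accept with probability min(1, p_target/p_draft).
    out, c = [], list(ctx)
    for t in draft:
        q = draft_model(tuple(c)).get(t, 0.0)
        p = target_model(tuple(c)).get(t, 0.0)
        if rng.random() < min(1.0, p / q):
            out.append(t)
            c.append(t)
        else:
            # Rejection: resample from the residual distribution max(p - q, 0), renormalized.
            tgt, dr = target_model(tuple(c)), draft_model(tuple(c))
            resid = {u: max(pu - dr.get(u, 0.0), 0.0) for u, pu in tgt.items()}
            z = sum(resid.values())
            if z == 0.0:  # degenerate case: fall back to a plain target sample
                resid, z = tgt, 1.0
            out.append(sample({u: pu / z for u, pu in resid.items()}, rng))
            return out  # stop at the first rejection
    return out

rng = random.Random(0)
ctx, gen = (0,), []
while len(gen) < 8:
    new = speculative_step(ctx, 4, rng)
    gen.extend(new)
    ctx = ctx + tuple(new)
```

Each round costs one parallel pass of the target model but can emit up to k tokens, which is where the speedup comes from. Real implementations also emit a bonus token sampled from the target when all k drafts are accepted; that detail is omitted here for brevity.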
Yuhui Li, Fangyun Wei, Chao Zhang, Hongyang Zhang. "EAGLE-3: Scaling up Inference Acceleration of Large Language Models via Training-Time Test", NeurIPS 2025, San Diego, USA. [arXiv] [code]
Yuhui Li, Fangyun Wei, Chao Zhang, Hongyang Zhang. "EAGLE-2: Faster Inference of Language Models with Dynamic Draft Trees", EMNLP 2024, Miami, USA. [arXiv] [code]
Yu Du, Fangyun Wei, Hongyang Zhang. "AnyTool: Self-Reflective, Hierarchical Agents for Large-Scale API Calls", ICML 2024, Vienna, Austria. [arXiv] [code]
Yuhui Li, Fangyun Wei, Jinjing Zhao, Chao Zhang, Hongyang Zhang. "RAIN: Your Language Models Can Align Themselves without Finetuning", ICLR 2024, Vienna, Austria. [arXiv] [code]
AI Security
Haochen Sun, Jason Li, Hongyang Zhang. "zkLLM: Zero Knowledge Proofs for Large Language Models", ACM CCS 2024, Salt Lake City, USA. [pdf]
Avrim Blum, Travis Dick, Naren Manoj, Hongyang Zhang. "Random Smoothing Might be Unable to Certify L∞ Robustness for High-Dimensional Images", Journal of Machine Learning Research 21, 2020. [pdf] [code]
Hongyang Zhang, Yaodong Yu, Jiantao Jiao, Eric P. Xing, Laurent El Ghaoui, Michael I. Jordan. "Theoretically Principled Trade-off between Robustness and Accuracy", ICML 2019 (Long Talk), Long Beach, USA. [arXiv] [code] (Champion of NeurIPS 2018 Adversarial Vision Challenge; Top-10 most influential paper in ICML 2019-2024 according to Google Scholar Metrics)
Learning Theory
Maria-Florina Balcan, Yi Li, David P. Woodruff, Hongyang Zhang. "Testing Matrix Rank, Optimally", SODA 2019, San Diego, USA. [pdf] [arXiv]
Pranjal Awasthi, Maria-Florina Balcan, Nika Haghtalab, Hongyang Zhang. "Learning and 1-bit Compressed Sensing under Asymmetric Noise", COLT 2016, New York, USA. [pdf]
World Models
Ruili Feng, Han Zhang, Zhantao Yang, Jie Xiao, Zhilei Shu, Zhiheng Liu, Andy Zheng, Yukun Huang, Yu Liu, Hongyang Zhang. "The Matrix: Infinite-Horizon World Generation with Real-Time Moving Control", NeurIPS 2025, San Diego, USA. [paper] [blog]
Books
Zhouchen Lin, Hongyang Zhang. "Low Rank Models in Visual Analysis: Theories, Algorithms and Applications", Academic Press, Elsevier, 2017. [Press link]
Table of Contents
Introduction
Linear Models (Single Subspace Models, Multiple-Subspace Models, Theoretical Analysis)
Non-Linear Models (Kernel Methods, Laplacian and Hyper-Laplacian Methods, Locally Linear Representation, Transformation Invariant Clustering)
(Senior) Area Chair: ICML 2026 (Senior Area Chair), ICLR 2026, ALT 2026, NeurIPS 2025 (Top Area Chairs), ACL 2025, ICML 2025 (Outstanding Meta Reviewer), AISTATS 2025, ICLR 2025, ALT 2025, PRCV 2025 (Senior Area Chair), NeurIPS 2024 (Top Area Chairs), ICML 2024 (TF2M), ICLR 2024, AISTATS 2024, PRCV 2024 (Senior Area Chair), NeurIPS 2023, AAAI 2022, AAAI 2021, VALSE 2021-2025.
Associate Editor: IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), Data-centric Machine Learning Research (DMLR), ACM Transactions on Probabilistic Machine Learning.
Membership: IEEE Senior Member, ACM Member, AAAI Member.
Journal Referee: Annals of Statistics, Journal of the American Statistical Association, Mathematical Reviews, Journal of Machine Learning Research, Machine Learning, International Journal of Computer Vision, Proceedings of the IEEE, IEEE Journal of Selected Topics in Signal Processing, IEEE Transactions on Information Theory, IEEE Transactions on Pattern Analysis and Machine Intelligence, IEEE Transactions on Signal Processing, IEEE Transactions on Information Forensics & Security, IEEE Transactions on Neural Networks and Learning Systems, IEEE Transactions on Cybernetics, IEEE Transactions on Dependable and Secure Computing, IEEE Signal Processing Letters, IEEE Access, Neurocomputing, ACM Transactions on Knowledge Discovery from Data, Management Science.
Conference Referee: STOC 2026, CVPR 2026, ITCS 2026, ALT 2026, SODA 2025, ALT 2025, CVPR 2025, STOC 2024, COLT 2024, ESA 2023, ACM CCS 2023, ALT 2023, IEEE SaTML 2023, MSML 2022, ALT 2022, IEEE Conference on Decision and Control 2021, NeurIPS 2021, ICLR 2021, AAAI 2020, STOC 2020, ICML 2020, NeurIPS 2020, FOCS 2020, AISTATS 2019, ITCS 2019, NeurIPS 2019, AAAI 2018, STOC 2018, ISIT 2018, ICML 2018, COLT 2018, NeurIPS 2018, APPROX 2018, ACML 2018, IJCAI 2017, STOC 2017, NIPS 2017, AAAI 2016, ICML 2016, NIPS 2016.
Talks
The EAGLE Series: Lossless Inference Acceleration for LLMs. Weizmann Institute, UIUC, 2024; UCSD, Google Research NYC, 2025. [slide] [video 1] [video 2]
New Advances in Safe, Self-evolving, and Efficient Large Language Models. AAAI New Faculty Highlights, UBC CAIDA, Google, 2024. [video]
AI Safety by the People, for the People. HKU, HKUST, 2023.
zkDL: Efficient Zero-Knowledge Proofs of Deep Learning. Peking University, Tsinghua University (IIIS), Intellect Horizon, 2023.
New Advances in (Adversarially) Robust and Secure Machine Learning. Qualcomm, Peking University, UMN, Yale, Waterloo, UChicago, NUS, MPI, USC, GaTech, Duke, BAAI, 2021-2022. [slide]
Theoretically Principled Trade-off between Robustness and Accuracy. Simons Institute, IPAM, TTIC, Caltech, CMU, ICML 2019, ICML Workshop on the Security and Privacy of Machine Learning, Peking University. [slide] [video]
Testing Matrix Rank, Optimally. SODA 2019. [slide]
Testing and Learning from Big Data, Optimally. CMU AI Lunch 2018. [slide]
New Paradigms and Global Optimality in Non-Convex Optimization. CMU Theory Lunch 2017. [slide] [video]
Active Learning of Linear Separators under Asymmetric Noise. Asilomar 2017. [slide]