I am a third-year Ph.D. student in the Computer Science Department at Carnegie Mellon University. I am a member of the CMU Catalyst research group, and I am fortunate to be advised by Prof. Zhihao Jia.
Prior to joining CMU, I received my Master's degree from the Robotics Institute of Carnegie Mellon University and my B.Sc. in Computer Science from Renmin University of China, where I was advised by Prof. Changliu Liu and Prof. Qin Jin.
@article{zhang2024accelerating,
  title={Accelerating retrieval-augmented language model serving with speculation},
  author={Zhang, Zhihao and Zhu, Alan and Yang, Lijie and Xu, Yihua and Li, Lanting and Phothilimthana, Phitchaya Mangpo and Jia, Zhihao},
  journal={To appear at ICML 2024},
  year={2024}
}
ASPLOS
SpecInfer: Accelerating Generative LLM Serving with Speculative Inference and Token Tree Verification
Xupeng Miao*, Gabriele Oliaro*, Zhihao Zhang*, and 7 more authors
@article{miao2023specinfer,
  title={Specinfer: Accelerating generative llm serving with speculative inference and token tree verification},
  author={Miao, Xupeng and Oliaro, Gabriele and Zhang, Zhihao and Cheng, Xinhao and Wang, Zeyu and Wong, Rae Ying Yee and Chen, Zhuoming and Arfeen, Daiyaan and Abhyankar, Reyna and Jia, Zhihao},
  journal={To appear at ASPLOS 2024},
  year={2023},
  cofirst={4}
}
ICLR
GradSign: Model Performance Inference with Theoretical Insights
Zhihao Zhang and Zhihao Jia
In International Conference on Learning Representations, 2021
@inproceedings{zhang2021gradsign,
  title={GradSign: Model Performance Inference with Theoretical Insights},
  author={Zhang, Zhihao and Jia, Zhihao},
  booktitle={International Conference on Learning Representations},
  year={2021}
}
NeurIPS
Communication Bounds for the Distributed Experts Problem
Zhihao Jia, Qi Pang, Trung Tran, and 3 more authors (in alphabetical order)
In The Thirty-eighth Annual Conference on Neural Information Processing Systems, 2024
ICLR
TidalDecode: Fast and Accurate LLM Decoding with Position Persistent Sparse Attention
Lijie Yang*, Zhihao Zhang*, Zhuofu Chen, and 2 more authors