About Me
Hi 👋, I am a researcher at DeepSeek. I completed my PhD at ZIP Lab, Monash University, supervised by Prof. Bohan Zhuang and Prof. Jianfei Cai. Before that, I was a master’s student at the University of Adelaide, and I received my bachelor’s degree from Harbin Institute of Technology, Weihai, a beautiful coastal campus 🏖️ that left me with cherished memories.
My research during my PhD focused on efficiency in deep neural networks, including training, inference, and deployment. Some topics that I previously focused on:
- Flexible model deployment: SN-Net, SN-Netv2
- Transformer architecture optimization: LIT, LITv2
- Efficient attention mechanisms: HiLo, EcoFormer
- Token pruning/merging for inference speedup: HVT
- Memory-efficient training: Mesa
Since 2024, I’ve been working on multimodal LLMs at DeepSeek. If you work on any of the following topics and are interested in internship or full-time opportunities at DeepSeek, please feel free to contact me via email. We are always looking for talented people!
- VLM data and training.
- Math/Code/LLM Alignment.
- Agent Research/Infra.
Education
- Ph.D. in Computer Science, Monash University, 2021 - 2024.
- M.S. in Computer Science, The University of Adelaide, 2018 - 2020.
- B.E. in Software Engineering, Harbin Institute of Technology, Weihai, 2015 - 2019.
Work Experience
- Full-time researcher at DeepSeek, July 2024 - Present.
- Research intern at NVIDIA, AI Algorithm Group, July 2023 - Oct 2023.
Selected Publications

- Stitched ViTs are Flexible Vision Backbones
- Zizheng Pan, Jing Liu, Haoyu He, Jianfei Cai, Bohan Zhuang
- European Conference on Computer Vision (ECCV), 2024
- [Paper] [Code] [Project Page]

- MiniCache: KV Cache Compression in Depth Dimension for Large Language Models
- Akide Liu, Jing Liu, Zizheng Pan, Yefei He, Gholamreza Haffari, Bohan Zhuang
- Conference on Neural Information Processing Systems (NeurIPS), 2024
- [Paper]

- T-Stitch: Accelerating Sampling in Pre-trained Diffusion Models with Trajectory Stitching
- Zizheng Pan, Bohan Zhuang, De-An Huang, Weili Nie, Zhiding Yu, Chaowei Xiao, Jianfei Cai, Anima Anandkumar
- arXiv, 2024
- [Paper] [Code] [Project Page]

- Stitchable Neural Networks
- Zizheng Pan, Jianfei Cai, Bohan Zhuang
- Conference on Computer Vision and Pattern Recognition (CVPR), 2023 (Highlight)
- [Code], [Paper], [Project Page]

- Fast Vision Transformers with HiLo Attention
- Zizheng Pan, Jianfei Cai, Bohan Zhuang
- Conference on Neural Information Processing Systems (NeurIPS), 2022 (Spotlight)
- [Code], [Paper], [OpenReview]

- EcoFormer: Energy-Saving Attention with Linear Complexity
- Jing Liu*, Zizheng Pan*, Haoyu He, Jianfei Cai, Bohan Zhuang (*equal contribution)
- Conference on Neural Information Processing Systems (NeurIPS), 2022 (Spotlight)
- [Code], [Paper], [OpenReview]

- Less is More: Pay Less Attention in Vision Transformers
- Zizheng Pan, Bohan Zhuang, Haoyu He, Jing Liu, Jianfei Cai
- AAAI Conference on Artificial Intelligence (AAAI), 2022
- [Code], [Paper]

- Scalable Visual Transformers with Hierarchical Pooling
- Zizheng Pan, Bohan Zhuang, Jing Liu, Haoyu He, Jianfei Cai
- International Conference on Computer Vision (ICCV), 2021
- [Code], [Paper]
Teaching
- FIT5201 - Machine learning, 2022, TA
Talk
- 2023.10.25 “Optimizing Vision Transformers for Efficient Training, Inference and Deployment”, invited online talk at University of Massachusetts Amherst.
Professional Activities
Reviewer: CVPR, ICCV, ECCV, NeurIPS, ICML, ICLR, IJCV
Awards
- Google Travel and Conference Grants, 2023
- Monash Graduate Scholarship, 2020
- Adelaide Summer Research Scholarship, 2019
- Outstanding Graduate, Harbin Institute of Technology, 2019
