About Me

Hi 👋, I am a researcher at DeepSeek. I completed my PhD at ZIP Lab, Monash University, supervised by Prof. Bohan Zhuang and Prof. Jianfei Cai. Previously, I was a master’s student at the University of Adelaide. Prior to that, I received my bachelor’s degree from Harbin Institute of Technology, Weihai, a beautiful coastal campus 🏖️ that left me with cherished memories.

My PhD research focused on the efficiency of deep neural networks across training, inference, and deployment. Topics I previously worked on include:

  • Flexible model deployment: SN-Net, SN-Netv2
  • Transformer architecture optimization: LIT, LITv2
  • Efficient attention mechanisms: HiLo, EcoFormer
  • Token pruning/merging for inference speedup: HVT
  • Memory-efficient training: Mesa

Since 2024, I’ve been working on Multimodal LLMs at DeepSeek. If you work on any of the following topics and are interested in internship or full-time opportunities at DeepSeek, please feel free to contact me via email. We are always looking for talent!

  • VLM data and training.
  • Math/Code/LLM Alignment.
  • Agent Research/Infra.

Education

  • Ph.D. in Computer Science, Monash University, 2021 - 2024.
  • M.S. in Computer Science, The University of Adelaide, 2018 - 2020.
  • B.E. in Software Engineering, Harbin Institute of Technology, Weihai, 2015 - 2019.

Work Experience

  • Full-time researcher at DeepSeek, July 2024 - Present.
  • Research intern at NVIDIA, AI Algorithm Group, July 2023 - Oct 2023.

Selected Publications

Stitched ViTs are Flexible Vision Backbones
Zizheng Pan, Jing Liu, Haoyu He, Jianfei Cai, Bohan Zhuang
European Conference on Computer Vision (ECCV), 2024
[Paper] [Code] [Project Page]

MiniCache: KV Cache Compression in Depth Dimension for Large Language Models
Akide Liu, Jing Liu, Zizheng Pan, Yefei He, Gholamreza Haffari, Bohan Zhuang
Conference on Neural Information Processing Systems (NeurIPS), 2024
[Paper]

T-Stitch: Accelerating Sampling in Pre-trained Diffusion Models with Trajectory Stitching
Zizheng Pan, Bohan Zhuang, De-An Huang, Weili Nie, Zhiding Yu, Chaowei Xiao, Jianfei Cai, Anima Anandkumar
arXiv, 2024
[Paper] [Code] [Project Page]

Stitchable Neural Networks
Zizheng Pan, Jianfei Cai, Bohan Zhuang
Conference on Computer Vision and Pattern Recognition (CVPR), 2023 (Highlight)
[Code], [Paper], [Project Page]

Fast Vision Transformers with HiLo Attention
Zizheng Pan, Jianfei Cai, Bohan Zhuang
Conference on Neural Information Processing Systems (NeurIPS), 2022 (Spotlight)
[Code], [Paper], [OpenReview]

EcoFormer: Energy-Saving Attention with Linear Complexity
Jing Liu*, Zizheng Pan*, Haoyu He, Jianfei Cai, Bohan Zhuang (*equal contribution)
Conference on Neural Information Processing Systems (NeurIPS), 2022 (Spotlight)
[Code], [Paper], [OpenReview]

Less is More: Pay Less Attention in Vision Transformers
Zizheng Pan, Bohan Zhuang, Haoyu He, Jing Liu, Jianfei Cai
AAAI Conference on Artificial Intelligence (AAAI), 2022
[Code], [Paper]

Scalable Visual Transformers with Hierarchical Pooling
Zizheng Pan, Bohan Zhuang, Jing Liu, Haoyu He, Jianfei Cai
International Conference on Computer Vision (ICCV), 2021
[Code], [Paper]

Teaching

  • FIT5201 - Machine learning, 2022, TA

Talk

  • 2023.10.25   “Optimizing Vision Transformers for Efficient Training, Inference and Deployment”, invited online talk at University of Massachusetts Amherst.

Professional Activities

Reviewer: CVPR, ICCV, ECCV, NeurIPS, ICML, ICLR, IJCV

Awards

  • Google Travel and Conference Grants, 2023
  • Monash Graduate Scholarship, 2020
  • Adelaide Summer Research Scholarship, 2019
  • Outstanding Graduate of Harbin Institute of Technology, 2019