Biography
My research interests focus on Inclusive AI: making AI more open, robust, widely deployable, and efficient in inference:
- DNN Efficiency for CNNs, Diffusion Models, and LLMs
  - Pruning, architecture search, quantization, token reduction, etc.
  - Achieving real-time inference for DNN models on edge devices
- DNN Robustness
  - Exploring various DNN attack methods
  - Investigating defense methods against various DNN attacks
- LLM Openness
  - Training fully open-source LLMs
  - Adopting various LLM post-training techniques, such as SFT, DPO, GRPO, etc.
- Optimization Methods
  - Bi-level optimization, ADMM, Bayesian optimization, second-order optimization, zeroth-order optimization, min-max optimization, etc.
Interests
- Artificial Intelligence
- Adversarial Robustness
- Model Pruning & Quantization
- Token Reduction
- Real-Time Deployment
- Optimization Methods
- Diffusion Models and LLMs
Education
PhD in Computer Engineering, 2021
Northeastern University
MEng in Electrical Engineering, 2016
Shanghai Jiao Tong University
BSc in Electrical Engineering, 2013
Shanghai Jiao Tong University
Featured Publications
LazyDiT: Lazy Learning for the Acceleration of Diffusion Transformers
Diffusion Transformers have emerged as the preeminent models for a wide array of generative tasks, demonstrating superior performance …
Xuan Shen, Zhao Song, Yufa Zhou, Bo Chen, Yanyu Li, Yifan Gong, Kai Zhang, Hao Tan, Jason Kuen, Henghui Ding, Zhihao Shu, Wei Niu, Pu Zhao, Yanzhi Wang, Jiuxiang Gu
Numerical Pruning for Efficient Autoregressive Models
Transformers have emerged as the leading architecture in deep learning, proving to be versatile and highly effective across diverse …
Xuan Shen, Zhao Song, Yufa Zhou, Bo Chen, Jing Liu, Ruiyi Zhang, Ryan A. Rossi, Hao Tan, Tong Yu, Xiang Chen, Yufan Zhou, Tong Sun, Pu Zhao, Yanzhi Wang, Jiuxiang Gu
Exploring Token Pruning in Vision State Space Models
State Space Models (SSMs) have the advantage of keeping linear computational complexity compared to attention modules in transformers, …
Zheng Zhan, Zhenglun Kong, Yifan Gong, Yushu Wu, Zichong Meng, Hangyu Zheng, Xuan Shen, Stratis Ioannidis, Wei Niu, Pu Zhao, Yanzhi Wang
Fast and Memory-Efficient Video Diffusion Using Streamlined Inference
The rapid progress in artificial intelligence-generated content (AIGC), especially with diffusion models, has significantly advanced …
Zheng Zhan, Yushu Wu, Yifan Gong, Zichong Meng, Zhenglun Kong, Changdi Yang, Geng Yuan, Pu Zhao, Wei Niu, Yanzhi Wang
Rethinking Token Reduction for State Space Models
Recent advancements in State Space Models (SSMs) have attracted significant interest, particularly in models optimized for parallel …
Zheng Zhan, Yushu Wu, Zhenglun Kong, Changdi Yang, Yifan Gong, Xuan Shen, Xue Lin, Pu Zhao, Yanzhi Wang
DiffClass: Diffusion-Based Class Incremental Learning
Class Incremental Learning (CIL) is challenging due to catastrophic forgetting. On top of that, exemplar-free CIL is even more …
Zichong Meng, Jie Zhang, Changdi Yang, Zheng Zhan, Pu Zhao, Yanzhi Wang
InstructGIE: Towards Generalizable Image Editing
Recent advances in image editing have been driven by the development of denoising diffusion models, marking a significant leap forward …
Zichong Meng, Changdi Yang, Jun Liu, Hao Tang, Pu Zhao, Yanzhi Wang
Condense: A Framework for Device and Frequency Adaptive Neural Network Models on the Edge
With the popularity of battery-powered edge computing, an important yet under-explored problem is the supporting of DNNs for diverse …
Yifan Gong, Pu Zhao, Zheng Zhan, Yushu Wu, Chao Wu, Zhenglun Kong, Minghai Qin, Caiwen Ding, Yanzhi Wang
All-in-One: A Highly Representative DNN Pruning Framework for Edge Devices with Dynamic Power Management
During the deployment of deep neural networks (DNNs) on edge devices, many research efforts are devoted to the limited hardware …
Yifan Gong, Zheng Zhan, Pu Zhao, Yushu Wu, Chao Wu, Caiwen Ding, Weiwen Jiang, Minghai Qin, Yanzhi Wang
CoCoPIE: Enabling Real-Time AI on Off-the-Shelf Mobile Devices via Compression-Compilation Co-Design
Assuming hardware is the major constraint for enabling real-time mobile intelligence, the industry has mainly dedicated their efforts …
Hui Guan, Shaoshan Liu, Xiaolong Ma, Wei Niu, Bin Ren, Xipeng Shen, Yanzhi Wang, Pu Zhao
Towards Real-Time DNN Inference on Mobile Platforms with Model Pruning and Compiler Optimization
High-end mobile platforms rapidly serve as primary computing devices for a wide range of Deep Neural Network (DNN) applications. …
Wei Niu, Pu Zhao, Zheng Zhan, Xue Lin, Yanzhi Wang, Bin Ren
Bridging Mode Connectivity in Loss Landscapes and Adversarial Robustness
Mode connectivity provides novel geometric insights on analyzing loss landscapes and enables building high-accuracy pathways between …
Pu Zhao, Pin-Yu Chen, Payel Das, Karthikeyan Natesan Ramamurthy, Xue Lin
Structured Adversarial Attack: Towards General Implementation and Better Interpretability
When generating adversarial examples to attack deep neural networks (DNNs), Lp norm of the added perturbation is usually used to …
Kaidi Xu, Sijia Liu, Pu Zhao, Pin-Yu Chen, Huan Zhang, Quanfu Fan, Deniz Erdogmus, Yanzhi Wang, Xue Lin
Recent Publications
Lin Zhao, Yushu Wu, Xinru Jiang, Jianyang Gu, Yanzhi Wang, Xiaolin Xu, Pu Zhao, Xue Lin
(2025).
Taming Diffusion for Dataset Distillation with High Representativeness.
In ICML 2025.
Xuan Shen, Weize Ma, Jing Liu, Changdi Yang, Rui Ding, Quanyi Wang, Henghui Ding, Wei Niu, Yanzhi Wang, Pu Zhao, Jun Lin, Jiuxiang Gu
(2025).
QuartDepth: Post-Training Quantization for Real-Time Depth Estimation on the Edge.
In CVPR 2025.
Xuan Shen, Hangyu Zheng, Yifan Gong, Zhenglun Kong, Changdi Yang, Zheng Zhan, Yushu Wu, Xue Lin, Yanzhi Wang, Pu Zhao, Wei Niu
(2025).
Sparse Learning for State Space Models on Mobile.
In ICLR 2025.
Jun Liu, Zhenglun Kong, Peiyan Dong, Xuan Shen, Pu Zhao, Hao Tang, Geng Yuan, Wei Niu, Wenbin Zhang, Xue Lin, Dong Huang, Yanzhi Wang
(2025).
RoRA: Efficient Fine-Tuning of LLM with Reliability Optimization for Rank Adaptation.
In ICASSP 2025.
Jun Liu, Zhenglun Kong, Pu Zhao, Changdi Yang, Hao Tang, Xuan Shen, Geng Yuan, Wei Niu, Wenbin Zhang, Xue Lin, Dong Huang, Yanzhi Wang
(2025).
Toward Adaptive Large Language Models Structured Pruning via Hybrid-grained Weight Importance Assessment.
In AAAI 2025.
Pinrui Yu, Zhenglun Kong, Pu Zhao, Peiyan Dong, Hao Tang, Fei Sun, Xue Lin, Yanzhi Wang
(2025).
Q-TempFusion: Quantization-Aware Temporal Multi-Sensor Fusion on Bird's-Eye View Representation.
In WACV 2025.
Xuan Shen, Pu Zhao, Yifan Gong, Zhenglun Kong, Zheng Zhan, Yushu Wu, Ming Lin, Chao Wu, Xue Lin, Yanzhi Wang
(2024).
Search for Efficient Large Language Models.
In NeurIPS 2024.
Pu Zhao, Fei Sun, Xuan Shen, Pinrui Yu, Zhenglun Kong, Yanzhi Wang, Xue Lin
(2024).
Pruning Foundation Models for High Accuracy without Retraining.
In EMNLP 2024 Findings.
Jun Liu, Zhenglun Kong, Pu Zhao, Weihao Zeng, Hao Tang, Xuan Shen, Changdi Yang, Wenbin Zhang, Geng Yuan, Wei Niu, Xue Lin, Yanzhi Wang
(2024).
TSLA: A Task-Specific Learning Adaptation for Semantic Segmentation on Autonomous Vehicles Platform.
In TCAD.
Xuan Shen, Zhaoyang Han, Lei Lu, Zhenglun Kong, Peiyan Dong, Zhengang Li, Yanyue Xie, Chao Wu, Miriam Leeser, Pu Zhao, Xue Lin, Yanzhi Wang
(2024).
HotaQ: Hardware Oriented Token Adaptive Quantization for Large Language Models.
In TCAD.
Accomplishments
Third Place in U.S. Department of Transportation’s Inclusive Design Challenge
U.S. Department of Transportation
Aug 2022
Third Place in the second phase of the 2022 national Inclusive Design Challenge from the U.S. Department of Transportation
First Place in ISLPED 2020 Design Contest
ISLPED
Aug 2020
CoCoPIE: A Framework of Compression-Compilation Co-design Towards Ultra-High Energy Efficiency and Real Time DNN Inference on Mobile Devices
Best Paper Nomination in ICCAD 2018
ICCAD
Mar 2018
Defensive Dropout for Hardening Deep Neural Networks under Adversarial Attacks
National Scholarship for Postgraduate Students
Shanghai Jiao Tong University
Oct 2015
National Scholarship for Postgraduate Students
Shanghai Jiao Tong University
Oct 2014
Academic Excellence Scholarship for Postgraduate Students
Shanghai Jiao Tong University
Jan 2014
Academic Excellence Scholarship for Postgraduate Students
Shanghai Jiao Tong University
Jan 2013
Academic Excellence Scholarship
Shanghai Jiao Tong University
Jan 2012
Academic Excellence Scholarship
Shanghai Jiao Tong University
Jan 2011
First Prize at 26th National Physics Contest for College Students
Shanghai Jiao Tong University
Dec 2009
Experience
Research Assistant Professor
Northeastern University
Sep 2021 –
Present
Boston, MA
Research Internship
May 2020 –
Sep 2020
California
Explored image search with text assistance.
Research Fall Internship-Graduate
IBM Thomas J. Watson Research Center
Dec 2018 –
Mar 2019
New York
Investigated the defensive performance of mode connectivity against three kinds of attacks: backdoor attacks, fault injection attacks, and adversarial attacks.
Teaching Assistant
Computer Architecture
Sep 2017 –
Dec 2017
Boston
Helped teach computer architecture, including designing the course project and answering questions during office hours.
Research Assistant
Computer Engineering
Sep 2016 –
Aug 2021
Boston
Researched adversarial robustness and model efficiency.
Contact
- p.zhao@northeastern.edu
- 140 The Fenway, Boston, MA 02120
- R306