CVPR 2025, Nashville, Tennessee
Overview and Call for Papers
The Second Workshop on Efficient and On-Device Generation (EDGE) at CVPR 2025 focuses on the latest advances in generative AI for computer vision, with an emphasis on efficiency across multiple dimensions. We encourage techniques that enable generative models to be trained more efficiently and to run on resource-constrained hardware, such as mobile phones and edge devices. Through these efforts, we envision a future in which increasingly pervasive generative AI capabilities become significantly more accessible, scaling gracefully while their carbon footprint plateaus.
Topics of interest include, but are not limited to:
- Modeling efficiency:
- Design of efficient modules and algorithms for generative models.
- Efficient architecture for generative models: auto-encoders, diffusion, GAN, autoregressive models, etc.
- Efficient transformers and state-space models.
- Training efficiency:
- Methods for reducing the memory footprint of training generative models.
- Using knowledge distillation to support model compression.
- Hardware-aware training acceleration and energy efficient computing.
- Online training under edge constraints.
- Parameter efficient tuning.
- Mixed pre-training to enable downstream tasks via lightweight tuning.
- Inference efficiency:
- Techniques for reducing the computational complexity of generative models, such as quantization, sparsification, and MoEs.
- Advanced sampling techniques for fast inference.
- Approaches for efficient deployment and distribution of generative models across multiple devices or edge devices.
- Hardware-aware inference acceleration and energy efficient computing.
- Data efficiency:
- Few-shot diffusion models.
- Generative model fine-tuning.
- Training generative models with limited data.
- Application-specific efficiency:
- Applications of efficient generative AI in computer vision, such as image / video generation, image / video editing, style transfer, and super-resolution.
- Efficient personalization and post-generation editing.
- Efficient generative models for real-time video synthesis: talking heads, virtual try-on, gaming engines, etc.
Format: Submissions must use the CVPR 2025 Author Kit (LaTeX/Word zip file) and follow the CVPR 2025 author instructions and submission policies. Submissions must be anonymized. A paper accepted to the CVPR 2025 main conference may be resubmitted to our Extended Abstract track, and a paper accepted by another venue may be resubmitted to our Extended Abstract track if that venue allows it. A submission to another CVPR 2025 workshop cannot be resubmitted to EDGE 2025. The workshop offers two submission tracks:
- Long Paper: submissions are limited to 8 pages excluding references, no appendix or supplementary;
- Extended Abstract: submissions are limited to 4 pages excluding references, no appendix or supplementary.
Only long papers will be included in the CVPR 2025 proceedings.
===
Submission Site: https://openreview.net/group?id=thecvf.com/CVPR/2025/Workshop/EDGE
Submission Deadline: March 21, 2025 (AOE)
Workshop Date: June 12, 2025, 13:00-17:00
Workshop Location: Room 208 A
Poster Session: June 12, 2025, 12:00-13:00 (before the workshop starts)
Poster Session Location: ExHall D, Board #202-214
Speakers

Stefano Ermon
Stanford

Lu Liu
OpenAI

Ziwei Liu
Nanyang Technological University

Ishan Misra
GenAI at Meta

Sergey Tulyakov
Snap Inc.

Jiaming Song
Luma AI

Enze Xie
NVIDIA

Jun-Yan Zhu
Carnegie Mellon University

Lu Jiang
ByteDance
Schedule
Each talk is assigned a 25-minute slot (20 min talk + 4 min Q&A + 1 min buffer).

| Time | Activity | Title |
|---|---|---|
| 13:00 - 13:10 | Opening Remarks and Award Announcement | |
| 13:10 - 13:35 | Ziwei Liu, Nanyang Technological University | "From Multimodal Generative Models to Dynamic World Modeling" |
| 13:35 - 14:00 | Stefano Ermon, Stanford University | "Accelerating Inference in Diffusion Models" |
| 14:00 - 14:25 | Ishan Misra, GenAI at Meta | "Scale Efficient Video Generation and Tokenization" |
| 14:25 - 14:50 | Sergey Tulyakov, Snap Inc. | "Sharpening the Edge: High Quality Image and Video Synthesis on Mobiles" |
| 14:50 - 14:55 | Break | |
| 14:55 - 15:20 | Jiaming Song, Luma AI | "Breaking the Algorithmic Ceiling in Pre-Training with an Inference-first Perspective" |
| 15:20 - 15:45 | Jun-Yan Zhu, Carnegie Mellon University | "Distilling Diffusion Models into Conditional GANs" |
| 15:45 - 16:10 | Lu Jiang, ByteDance | "Cost-Effective Training of Video Generation Foundation Model" |
| 16:10 - 16:35 | Enze Xie, NVIDIA | "Building Image Generation model from scratch and Acceleration" |
| 16:35 - 17:00 | Lu Liu, OpenAI | "A Brief Introduction of 4o Image Generation" |
Accepted Papers
We are pleased to announce the accepted papers for the Second Workshop on Efficient and On-Device Generation (EDGE) at CVPR 2025. Congratulations to all authors!
Long Papers
- Title: Geometric Consistency Refinement for Single Image Novel View Synthesis via Test-Time Adaptation of Diffusion Models
  Authors: Josef Bengtson (Chalmers University of Technology), David Nilsson (Chalmers University of Technology), Fredrik Kahl (Chalmers University of Technology)
  [Poster #202]
- Title: AdaVid: Adaptive Video-Language Pretraining
  Authors: Chaitanya Patel (Stanford University), Juan Carlos Niebles (Salesforce Research), Ehsan Adeli (Stanford University)
  [Poster #203]
- Title: LLMPi: Optimizing LLMs for High-Throughput on Raspberry Pi
  Authors: Mahsa Ardakani (University of South Carolina), Jinendra Malekar (University of South Carolina), Ramtin Zand (University of South Carolina)
  [Poster #204]
- Title: Latent Patched Efficient Diffusion Model For High Resolution Image Synthesis
  Authors: Weiyun Jiang (Rice University), Devendra Kumar Jangid (Samsung Research America), Seok-Jun Lee (Samsung Research America), Hamid Rahim Sheikh (Samsung Research America)
  [Poster #205]
- Title: ADAPTOR: Adaptive Token Reduction for Video Diffusion Transformers
  Authors: Elia Peruzzo (University of Trento), Adil Karjauv (Qualcomm), Nicu Sebe (University of Trento), Amir Ghodrati (Qualcomm AI Research), Amir Habibian (Qualcomm)
  [Poster #206]
- Title: Scaling On-Device GPU Inference for Large Generative Models
  Authors: Jiuqiang Tang (Google), Raman Sarokin (Google), Ekaterina Ignasheva (Meta), Grant Jensen (Google), Lin Chen (Google), Juhyun Lee (Google), Andrei Kulik (Google), Matthias Grundmann (Google)
  [Poster #207]
Extended Abstracts
- Title: Random Conditioning for Diffusion Model Compression with Distillation
  Authors: Dohyun Kim (Korea University), Sehwan Park (Korea University), Geonhee Han (Korea University), Seung Wook Kim (NVIDIA), Paul Hongsuck Seo (Korea University)
  [Poster #208]
- Title: LatentLLM: Attention-Aware Joint Tensor Compression
  Authors: Toshiaki Koike-Akino (Mitsubishi Electric Research Labs), Xiangyu Chen (Mitsubishi Electric Research Labs), Jing Liu (Mitsubishi Electric Research Labs), Ye Wang (Mitsubishi Electric Research Labs), Pu Perry Wang (Mitsubishi Electric Research Labs), Matthew Brand (Mitsubishi Electric Research Labs)
  [Poster #209]
- Title: MS-Temba: Multi-Scale Temporal Mamba for Efficient Temporal Action Detection
  Authors: Arkaprava Sinha (University of North Carolina at Charlotte), Monish Soundar Raj (University of North Carolina at Charlotte), Pu Wang (University of North Carolina at Charlotte), Ahmed Helmy (University of North Carolina at Charlotte), Srijan Das (University of North Carolina at Charlotte)
  [Poster #210]
- Title: TuneComp: Joint Fine-Tuning and Compression for Large Foundation Models
  Authors: Xiangyu Chen (Mitsubishi Electric Research Labs), Jing Liu (Mitsubishi Electric Research Labs), Ye Wang (Mitsubishi Electric Research Labs), Matthew Brand (Mitsubishi Electric Research Labs), Pu Perry Wang (Mitsubishi Electric Research Labs), Toshiaki Koike-Akino (Mitsubishi Electric Research Labs)
  [Poster #211]
- Title: Efficient Personalization of Quantized Diffusion Model without Backpropagation
  Authors: Hoigi Seo (Seoul National University), Wongi Jeong (Seoul National University), Kyungryeol Lee (Seoul National University), Se Young Chun (Seoul National University)
  [Poster #212]
- Title: Atlas: Multi-Scale Attention Improves Long Context Image Modeling
  Authors: Kumar Krishna Agrawal (University of California, Berkeley), Long Lian (University of California, Berkeley), Longchao Liu (University of California, Berkeley), Natalia Harguindeguy (University of California, Berkeley), Boyi Li (NVIDIA Research), Alexander G Bick (Vanderbilt University), Maggie Chung (University of California, San Francisco), Trevor Darrell (University of California, Berkeley), Adam Yala (University of California, Berkeley)
  [Poster #213]
- Title: LoRA.rar: Learning to Merge LoRAs via Hypernetworks for Subject-Style Conditioned Image Generation
  Authors: Donald Shenaj (Samsung R&D Institute UK (SRUK), University of Padova), Ondrej Bohdal (Samsung R&D Institute UK (SRUK)), Mete Ozay (Samsung R&D Institute UK (SRUK)), Pietro Zanuttigh (University of Padova), Umberto Michieli (Samsung R&D Institute UK (SRUK))
  [Poster #214]
Awards
We are pleased to announce two paper awards at the CVPR 2025 EDGE workshop, sponsored by PixVerse. Congratulations to the award winners!
Best Paper Award
- Title: AdaVid: Adaptive Video-Language Pretraining
  Authors: Chaitanya Patel (Stanford University), Juan Carlos Niebles (Salesforce Research), Ehsan Adeli (Stanford University)
  [Poster #203]
Best Paper Runner-Up Award
- Title: Scaling On-Device GPU Inference for Large Generative Models
  Authors: Jiuqiang Tang (Google), Raman Sarokin (Google), Ekaterina Ignasheva (Meta), Grant Jensen (Google), Lin Chen (Google), Juhyun Lee (Google), Andrei Kulik (Google), Matthias Grundmann (Google)
  [Poster #207]
Organizing Committee

Felix Juefei-Xu*
GenAI, Meta

Tingbo Hou*
Google DeepMind

Yang Zhao
Google DeepMind

Licheng Yu
GenAI, Meta

Zhisheng Xiao
Google DeepMind

Xiaoliang Dai
GenAI, Meta

Qifei Wang
Google DeepMind

Tao Xu
GenAI, Meta

Yanwu Xu

Ali Thabet
GenAI, Meta

Qiang Liu
UT Austin

Xuan Ju
CUHK

Ruiqi Gao
Google DeepMind

Xi Yin
GenAI, Meta

Haolin Jia

Xide Xia
GenAI, Meta

Peizhao Zhang
GenAI, Meta

Peter Vajda
GenAI, Meta
Important Dates
- Submission Deadline: March 21, 2025 (AOE)
- Author Notification: April 1, 2025 (tentative)
- Camera-Ready Deadline: TBD
- Workshop Time and Location: June 12, 2025, 13:00-17:00, Room 208 A
Sponsor
PixVerse is one of the world's largest GenAI platforms, with over 70 million users worldwide, driven by in-house video generation models that deliver superior quality and efficiency. PixVerse aims to democratize video creation by enabling the billions of viewers who have never made a video to produce their first share-worthy video with AI.