🎬 KeyVID: Keyframe-Aware Video Diffusion for Audio-Synchronized Visual Animation
Official repository for **KeyVID**, presented in **“KeyVID: Keyframe-Aware Video Diffusion for Audio-Synchronized Visual Animation.”**
This work introduces a unified diffusion framework that generates temporally coherent videos conditioned on audio, guided by adaptive keyframe localization.
📦 Release Plan
cd motion_scores/network
python main.py --mode predict
Keyframe generator
Keyframe Localization — Coming soon
Keyframe Generation — Released
Interpolation Model — Released
Training Code — Coming soon
Checkpoints
⚙️ Environment Setup
We recommend using Python 3.10+ and PyTorch ≥ 2.1.
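To confirm that an environment meets these recommendations before running anything, a small check like the one below may help. It is a minimal sketch, not part of the repository: the helper names `parse_major_minor` and `meets_minimum` and the constants `MIN_PYTHON` and `MIN_TORCH` are illustrative assumptions, and the thresholds simply mirror the versions recommended above.

```python
import re
import sys

# Minimum versions taken from the recommendation above.
MIN_PYTHON = (3, 10)
MIN_TORCH = (2, 1)


def parse_major_minor(version: str) -> tuple[int, int]:
    """Extract (major, minor) from a version string such as '2.1.0+cu121'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(version: str, minimum: tuple[int, int]) -> bool:
    """Return True if the version's (major, minor) is at least the minimum."""
    return parse_major_minor(version) >= minimum


if __name__ == "__main__":
    python_version = f"{sys.version_info.major}.{sys.version_info.minor}"
    status = "OK" if meets_minimum(python_version, MIN_PYTHON) else "too old"
    print(f"Python {python_version}: {status}")

    try:
        import torch  # imported lazily so the check still runs without PyTorch
    except ImportError:
        print("PyTorch is not installed")
    else:
        status = "OK" if meets_minimum(torch.__version__, MIN_TORCH) else "too old"
        print(f"PyTorch {torch.__version__}: {status}")
```

Comparing `(major, minor)` tuples rather than raw strings avoids the common pitfall where `"3.9"` sorts after `"3.10"` lexicographically.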
Scripts for quantitative evaluation (e.g., FID, FVD, AlignSync, RelSync) will be added soon.
For qualitative comparison, you can view the generated videos in the outputs/ directory.
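For browsing the results, a short helper can enumerate the generated files in outputs/. This is a sketch under stated assumptions: the function name `list_videos` is hypothetical, and the `.mp4` extension is a guess, since the README does not specify the output format. Adjust `extensions` to match what the inference script actually writes.

```python
from pathlib import Path


def list_videos(root: str = "outputs",
                extensions: tuple[str, ...] = (".mp4",)) -> list[Path]:
    """Return video files under `root`, searched recursively and sorted by path.

    The default extension is an assumption; the README does not state
    which container format the generated videos use.
    """
    root_path = Path(root)
    if not root_path.is_dir():
        # No outputs yet (or wrong working directory): nothing to list.
        return []
    return sorted(
        p for p in root_path.rglob("*")
        if p.is_file() and p.suffix.lower() in extensions
    )


if __name__ == "__main__":
    for video in list_videos():
        print(video)
```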
📚 Citation
If you find this project useful, please cite:
@article{wang2025keyvid,
title={KeyVID: Keyframe-Aware Video Diffusion for Audio-Synchronized Visual Animation},
author={Wang, Xingrui and Liu, Jiang and Wang, Ze and Yu, Xiaodong and Wu, Jialian and Sun, Ximeng and Su, Yusheng and Yuille, Alan and Liu, Zicheng and Barsoum, Emad},
journal={arXiv preprint arXiv:2504.09656},
year={2025}
}