VLAP: Efficient Video-Language Alignment via Frame Prompting and Distilling for Video Question Answering
ECCV 2024
Tech Lead, Amazon AI Studio | Ex-Tech Lead, LLM for Amazon Product Search | Ex-Tech Lead, Amazon Video Search (Amazon/A9) | Xoogler
I currently drive the Amazon AI Studio initiative while continuing to advance research on multi-modal LLMs and video generation. Previously, I led Amazon Video Search and the LLM reranker for Amazon Product Search. My broader interests span multi-modal perception, 3D computer vision, and physically based simulation.
Before Amazon, I was a Senior Research SDE at Google Research focusing on multi-modal content creation and 3D vision. I completed my PhD at UNC-CH under Prof. Ming C. Lin.
Xijun Wang, Junbang Liang, Chun-Kai Wang, Kenan Deng, Yu Lou, Ming Lin, Shan Yang
ECCV 2024
Xijun Wang, Anqi Liang, Junbang Liang, Ming Lin, Yu Lou, Shan Yang
AAAI 2024
Muhammad Osama Khan, Junbang Liang, Chun-Kai Wang, Shan Yang, Yu Lou
NeurIPS 2023 Workshop on Self-Supervised Learning: Theory and Practice
Akshay Gadi Patil, Yiming Qian, Shan Yang, Brian Jackson, Eric Bennett, Hao Zhang
In submission
Arsha Nagrani, Shan Yang, Anurag Arnab, Aren Jansen, Cordelia Schmid, Chen Sun
NeurIPS 2021
Shan Yang*, Bo Hu*, David A. Ross, Avneesh Sud, Yi Liu, Graham Ruby, Bryan Seybold
CVPR 2021 (CV4Animals Workshop)
Shan Yang, Vladimir Jojic, Jun Lian, Ronald Chen, Hongtu Zhu, Ming C. Lin
MICCAI 2016
Shan Yang, Wenlong Lu, Lixu Gu
CARS 2012
Supporting multimodal dance generation
Developer tools and loaders for the AIST++ dataset
Educational implementation of diffusion models
Interactive creation and editing app