About
I am a Research Scientist at Luma AI, working on multimodal foundation models. I obtained my Ph.D. in Computer Vision at the University of Michigan, where I was advised by Prof. Andrew Owens. Prior to that, I earned my Master's degree at the University of Michigan and my Bachelor's degree at Shanghai Jiao Tong University.
My primary research interests lie in multimodal learning and computer vision. I spent wonderful years in the computer vision group at Michigan. During my Ph.D., I was fortunate to intern at the Codec Avatars Lab at Meta and the SODA (Sound Design AI) Group at Adobe Research. I have also been honored to collaborate with Prof. David Fouhey and Prof. Alex Wong.
News
- [2025/05] I am joining Luma AI as a full-time Research Scientist!
- [2025/05] I defended my Ph.D. and graduated!
- [2025/02] Three papers have been accepted to CVPR 2025! See you in Nashville!
- [2024/11] Check out our new work MultiFoley for creative Foley sound generation!
Research
My research so far has focused on audio-visual learning, multimodal learning (vision, audio, language, touch), and self-supervised computer vision. I am open to other topics as well.
Work Experience
Adobe Research, May 2024 ~ Nov. 2024
Research Scientist Intern
Hosts: Prem Seetharaman, Bryan Russell, and Justin Salamon
Codec Avatars Lab, Pittsburgh, Meta, May 2023 ~ Nov. 2023
Research Scientist Intern
Hosts: Israel D. Gebru, Christian Richardt, and Alexander Richard
Publications
(* indicates equal contribution)
Video-Guided Foley Sound Generation with Multimodal Controls
Ziyang Chen, Prem Seetharaman, Bryan Russell, Oriol Nieto, David Bourgin, Andrew Owens, Justin Salamon
CVPR 2025
project page · paper · bibtex
GPS as a Control Signal for Image Generation
Chao Feng, Ziyang Chen, Aleksander Holynski, Alexei A. Efros, Andrew Owens
CVPR 2025
project page · paper · code · bibtex
Supervising Sound Localization Using In-the-wild Ego-motion
Anna Min, Ziyang Chen, Hang Zhao, Andrew Owens
CVPR 2025 (Highlight)
Images that Sound: Composing Images and Sounds on a Single Canvas
Ziyang Chen, Daniel Geng, Andrew Owens
NeurIPS 2024
CVPR 2024 AI Art Gallery
project page · paper · code · bibtex
Real Acoustic Fields: An Audio-Visual Room Acoustics Dataset and Benchmark
Ziyang Chen, Israel D. Gebru, Christian Richardt, Anurag Kumar, William Laney, Andrew Owens, Alexander Richard
CVPR 2024 (Highlight -- 2.8% accept rate)
project page · paper · dataset · bibtex
Binding Touch to Everything: Learning Unified Multimodal Tactile Representations
Fengyu Yang*, Chao Feng*, Ziyang Chen*, ......, Andrew Owens, Alex Wong
CVPR 2024
project page · paper · code · bibtex
Sound Localization from Motion: Jointly Learning Sound Direction and Camera Rotation
Ziyang Chen, Shengyi Qian, Andrew Owens
ICCV 2023
project page · paper · code · bibtex
Conditional Generation of Audio from Video via Foley Analogies
Yuexi Du, Ziyang Chen, Justin Salamon, Bryan Russell, Andrew Owens
CVPR 2023
project page · paper · code · bibtex
Self-Supervised Video Forensics by Audio-Visual Anomaly Detection
Chao Feng, Ziyang Chen, Andrew Owens
CVPR 2023 (Highlight -- 2.5% accept rate)
project page · paper · code · bibtex
Sound Localization by Self-Supervised Time Delay Estimation
Ziyang Chen, David F. Fouhey, Andrew Owens
ECCV 2022
project page · paper · slides · code · bibtex
Mix and Localize: Localizing Sound Sources in Mixtures
Xixi Hu*, Ziyang Chen*, Andrew Owens
CVPR 2022
project page · paper · slides · code · bibtex
Structure from Silence: Learning Scene Structure from Ambient Sound
Ziyang Chen*, Xixi Hu*, Andrew Owens
CoRL 2021 (Oral)
project page · paper · slides · code · bibtex
Professional Service
Workshop Organizer:
- ICCV 2025 Workshop: Generative AI for Audio-Visual Content Creation (Co-organizer)
- CVPR 2025 Workshop: Sight and Sound (Co-organizer)
- CVPR 2024 Workshop: Sight and Sound (Co-organizer)
Reviewer:
- Conference: WACV 2023, CVPR 2023, ICCV 2023, CVPR 2024, SIGGRAPH 2024, ECCV 2024, NeurIPS 2024, CVPR 2025, ICCV 2025
- Journal: IJCV
Teaching Assistant:
- VE312 Digital Integrated Circuits, Fall 2018, UM-SJTU Joint Institute. With Yaping Dan.
- VE311 Electronic Circuits, Summer 2019, UM-SJTU Joint Institute. With Chang-Ching Tu.














