Biography
Jiaqi Wang is currently a Research Scientist and Team Leader at Shanghai AI Laboratory, as well as an Adjunct Ph.D. Supervisor at Shanghai Jiao Tong University. He received his Ph.D. in 2021 from the Multimedia Laboratory (MMLab@CUHK) at the Chinese University of Hong Kong, where he was advised by Prof. Dahua Lin and worked closely with Prof. Chen Change Loy. His research focuses on visual perception, vision-language models, and the development of benchmarks and datasets in these areas.
He has published over 60 papers in top-tier journals and conferences, including TPAMI, CVPR, ICCV, ECCV, NeurIPS, ICML, ICLR, and ACL, with more than 16,000 citations and an h-index of 44 on Google Scholar. Two of his papers (MMBench and ShareGPT4V) were recognized among the Most Influential ECCV Papers of 2024, and one paper (MMStar) was listed as one of the Most Influential NeurIPS Papers of 2024. Another work (OmniObject3D) was nominated as a CVPR 2023 Best Paper Candidate. He was also named in the Stanford/Elsevier Top 2% Scientists List in 2024.
As the lead researcher of the InternLM-XComposer series of multimodal large models, he has driven the project to over 4 million downloads; it was once ranked fifth among more than 750,000 models on Hugging Face Trending.
We have open positions for research interns (undergraduate, master's, Ph.D., and gap-year students are all welcome) and full-time researchers. Please drop me an e-mail (wjqdev@gmail.com) if you are interested.
News
- [2025-06] Seven papers accepted to ICCV 2025.
- [2025-05] Four papers accepted to ACL 2025 (2 Main, 2 Findings).
- [2025-05] Two papers accepted to ICML 2025 (One Oral).
- [2025-02] Four papers accepted to CVPR 2025.
- [2025-01] We release InternLM-XComposer-2.5-Reward, a simple yet effective multi-modal reward model for large vision-language models (LVLMs).
- [2025-01] Three papers accepted to ICLR 2025.
- [2024-12] We release InternLM-XComposer-2.5-OmniLive, a comprehensive multimodal system for long-term streaming video and audio interactions.
- [2024-09] One paper accepted to TPAMI.
- [2024-09] Selected for the Stanford/Elsevier Top 2% Scientists List 2024.
- [2024-09] Two papers listed among the Most Influential ECCV Papers of 2024 (Rank 2: MMBench; Rank 4: ShareGPT4V).
- [2024-09] Nine papers accepted to NeurIPS 2024 (Four in D&B Track, One Spotlight).
- [2024-07] We release InternLM-XComposer-2.5, a versatile large vision language model supporting long-contextual input and output.
- [2024-07] Three papers accepted to ECCV 2024 (One Oral).
- [2024-04] We are hosting the V3Det Challenge jointly with the Visual Perception via Learning in an Open World Workshop at CVPR 2024.
- [2024-04] We release InternLM-XComposer2-4KHD, a pioneering large vision-language model handling resolutions from 336 pixels to 4K HD.
- [2024-02] Four papers accepted to CVPR 2024 (Two Spotlights).
- [2024-01] We release InternLM-XComposer2, a groundbreaking vision-language large model excelling in free-form text-image composition and comprehension.
- [2023-09] We release InternLM-XComposer, a large foundation model for advanced text-image comprehension and composition.
- [2023-09] One paper accepted to AAAI 2024.
- [2023-07] One paper accepted to SIGGRAPH Asia 2023.
- [2023-07] One paper accepted to ICCV 2023 (Oral).
- [2023-02] One paper accepted to ICML 2023.
- [2023-02] Four papers accepted to CVPR 2023 (One Best Paper Award Candidate).
- [2023-01] One paper accepted to ICLR 2023 (Spotlight).
- [2022-10] Two papers accepted to AAAI 2023 (One Oral).
- [2022-07] One paper accepted to NeurIPS 2022.
- [2022-02] One paper accepted to CVPR 2022.
- [2021-11] Received his Ph.D. from MMLab and joined Shanghai AI Laboratory.
- [2021-10] One paper accepted to TIP.
- [2021-09] One paper accepted to NeurIPS 2021.
- [2021-04] One paper accepted to TPAMI.
- [2021-03] One paper accepted to CVPR 2021.
- [2020-07] One paper accepted to ECCV 2020 (Spotlight).
- [2019-11] 1st prize in the COCO 2019 Object Detection Challenge (without external data) (Team Leader, Team: MMDet).
- [2019-07] One paper accepted to ICCV 2019 (Oral).
- [2019-02] Two papers accepted to CVPR 2019.
- [2018-09] MMDetection is released. See Technical Report.
- [2018-09] 1st prize in the COCO 2018 Object Detection Challenge (Team: MMDet).
- [2018-02] One paper accepted to CVPR 2018.
- [2017-08] Awarded the Hong Kong Ph.D. Fellowship Scheme (HKPFS) and joined MMLab.
Publications
➡️ Full List
Projects
Over 4 million downloads
Once ranked 5th among over 750,000 models on Hugging Face Trending
InternLM-XComposer-2.5-Reward: A Simple Yet Effective Multi-Modal Reward Model
InternLM-XComposer-2.5-OmniLive: A Specialized Generalist Multimodal System for Streaming Video and Audio Interactions
InternLM-XComposer-2.5: A Versatile Large Vision-Language Model Supporting Long-Contextual Input and Output
InternLM-XComposer2-4KHD: A Pioneering Large Vision-Language Model Handling Resolutions from 336 Pixels to 4K HD
InternLM-XComposer2: Mastering Free-Form Text-Image Composition and Comprehension in Vision-Language Large Models
InternLM-XComposer: A Vision-Language Large Model for Advanced Text-Image Comprehension and Composition
Honors and Awards
- Stanford/Elsevier Top 2% Scientists List, 2024
- Two papers among the Most Influential ECCV Papers of 2024 (MMBench, ShareGPT4V)
- CVPR 2023 Best Paper Award Candidate (OmniObject3D)
- 1st prize in COCO Object Detection Challenge (without external data) (Team Leader, Team: MMDet), 2019
- 1st prize in COCO Object Detection Challenge (Team: MMDet), 2018
- Hong Kong Ph.D. Fellowship Scheme (HKPFS), 2017
- National Scholarship, 2016
- National Scholarship, 2015
- National Scholarship, 2014