© Kunchang Li | Last updated: 05/26/2025
Kunchang Li
Researcher, Seed, Bytedance
Email: likunchang[at]bytedance.com
Biography [CV]
I am a Researcher at Seed, Bytedance. I obtained my Ph.D. degree from the Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences (SIAT) in 2025, under the supervision of Yu Qiao and Yali Wang. I received my B.S. degree from the Software School, Beihang University in 2020. Previously, I was a Research Intern at Megvii, SenseTime, Shanghai AI Laboratory, and Bytedance.
Most of my previous research concerns video foundation models, including model design, large-scale pretraining, dataset collection, and benchmark evaluation. Currently, I am dedicated to researching unified multimodal models, striving to contribute to the development of more powerful and intriguing AGI.
Research
* denotes co-first authors. Representative papers are highlighted. All my works are open-sourced on GitHub, and the full paper list can be found on Google Scholar.
-
InternVideo2: Scaling Foundation Models for Multimodal Video Understanding
Yi Wang*, Kunchang Li*, Xinhao Li*, Jiashuo Yu*, Yinan He*, Guo Chen, Baoqi Pei, Rongkun Zheng, Jilan Xu, Zun Wang, Yansong Shi, Tianxiang Jiang, Songze Li, Hongjie Zhang, Yifei Huang, Yu Qiao, Yali Wang, Limin Wang.
European Conference on Computer Vision (ECCV), 2024.
-
VideoMamba: State Space Model for Efficient Video Understanding
Kunchang Li, Xinhao Li, Yi Wang, Yinan He, Yali Wang, Limin Wang, Yu Qiao.
European Conference on Computer Vision (ECCV), 2024.
-
MVBench: A Comprehensive Multi-modal Video Understanding Benchmark
Kunchang Li, Yali Wang, Yinan He, Yizhuo Li, Yi Wang, Yi Liu, Zun Wang, Jilan Xu, Guo Chen, Ping Luo, Limin Wang, Yu Qiao.
Computer Vision and Pattern Recognition (CVPR), 2024. (Highlight, Top 3%) [Paper] [Code] [Demo] [Dataset] [Benchmark] [Leaderboard] [Video] [Blog]
-
Vlogger: Make Your Dream A Vlog
Shaobin Zhuang, Kunchang Li, Xinyuan Chen, Yaohui Wang, Ziwei Liu, Yu Qiao, Yali Wang.
Computer Vision and Pattern Recognition (CVPR), 2024.
-
InternVid: A Large-scale Video-Text Dataset for Multimodal Understanding and Generation.
Yi Wang*, Yinan He*, Yizhuo Li*, Kunchang Li, Jiashuo Yu, Xin Ma, Xinhao Li, Guo Chen, Xinyuan Chen, Yaohui Wang, Conghui He, Ping Luo, Ziwei Liu, Yali Wang, Limin Wang, Yu Qiao.
International Conference on Learning Representations (ICLR), 2024. (Spotlight, Top 6%)
-
UniFormer: Unifying Convolution and Self-attention for Visual Recognition
Kunchang Li*, Yali Wang*, Junhao Zhang, Peng Gao, Guanglu Song, Yu Liu, Hongsheng Li, Yu Qiao.
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023.
-
VideoChat: Chat-Centric Video Understanding
Kunchang Li*, Yinan He*, Yi Wang*, Yizhuo Li, Wenhai Wang, Ping Luo, Yali Wang, Limin Wang, Yu Qiao.
SCIENCE CHINA Information Sciences (SCIS), 2025.
-
Unmasked Teacher: Towards Training-Efficient Video Foundation Models
Kunchang Li, Yali Wang, Yizhuo Li, Yi Wang, Yinan He, Limin Wang, Yu Qiao.
International Conference on Computer Vision (ICCV), 2023. (Oral, Top 2%)
-
UniFormerV2: Unlocking the Potential of Image ViTs for Video Understanding
Kunchang Li, Yali Wang, Yinan He, Yizhuo Li, Yi Wang, Limin Wang, Yu Qiao.
International Conference on Computer Vision (ICCV), 2023.
-
InternVideo: General Video Foundation Models via Generative and Discriminative Learning.
Yi Wang*, Kunchang Li*, Yizhuo Li*, Yinan He*, Bingkun Huang*, Zhiyu Zhao*, Hongjie Zhang*, Jilan Xu, Yi Liu, Zun Wang, Sen Xing, Guo Chen, Junting Pan, Jiashuo Yu, Yali Wang, Limin Wang, Yu Qiao.
arXiv, 2022.
-
You Only Need 90K Parameters to Adapt Light
Ziteng Cui, Kunchang Li, Lin Gu, Shenghan Su, Peng Gao, Zhengkai Jiang, Yu Qiao, Tatsuya Harada.
The British Machine Vision Conference (BMVC), 2022.
-
PointCLIP: Point Cloud Understanding by CLIP
Renrui Zhang, Ziyu Guo, Wei Zhang, Kunchang Li, Xupeng Miao, Bin Cui, Yu Qiao, Peng Gao, Hongsheng Li.
Computer Vision and Pattern Recognition (CVPR), 2022.
-
MorphMLP: An Efficient MLP-Like Backbone for Spatial-Temporal Representation Learning
David Junhao Zhang*, Kunchang Li*, Yali Wang, Yunpeng Chen, Shashwat Chandra, Yu Qiao, Luoqi Liu, Mike Zheng Shou.
European Conference on Computer Vision (ECCV), 2022.
-
Self-slimmed Vision Transformer
Zhuofan Zong*, Kunchang Li*, Guanglu Song, Yali Wang, Yu Qiao, Biao Leng, Yu Liu.
European Conference on Computer Vision (ECCV), 2022.
-
Tip-Adapter: Training-free CLIP-Adapter for Better Vision-Language Modeling
Renrui Zhang*, Rongyao Fang*, Wei Zhang*, Peng Gao, Kunchang Li, Jifeng Dai, Yu Qiao, Hongsheng Li.
European Conference on Computer Vision (ECCV), 2022.
-
UniFormer: Unified Transformer for Efficient Spatiotemporal Representation Learning
Kunchang Li*, Yali Wang*, Junhao Zhang, Peng Gao, Guanglu Song, Yu Liu, Hongsheng Li, Yu Qiao.
International Conference on Learning Representations (ICLR), 2022. (Top 3% of review scores)
-
CT-Net: Channel Tensorization Network for Video Classification
Kunchang Li*, Xianhang Li*, Yali Wang*, Jun Wang, Yu Qiao.
International Conference on Learning Representations (ICLR), 2021.
Internships
-
Seed, Bytedance, Beijing, China (Oct. 2024 – Present)
Research Intern
-
OpenGVLab, Shanghai AI Lab, Shanghai, China (Nov. 2021 – Oct. 2024)
Research Intern
Advisors: Yali Wang, Limin Wang, Yi Wang
Topic: General Video Foundation Model; Large-scale Pre-training; Multimodal Learning
-
X-Lab, SenseTime, Beijing, China (Feb. 2021 – Nov. 2021)
Research Intern
Advisors: Guanglu Song, Yu Liu
Topic: Efficient Architecture Design; Video Understanding
-
Megvii, Beijing, China (Oct. 2019 – Jan. 2020)
Research Intern
Topic: Model Reproduction for MegEngine
Honors
-
Excellent Graduate of University of Chinese Academy of Sciences, 2025
-
Zhu Li Yuehua Scholarship, 2024
-
Outstanding Graduate Student of SIAT, UCAS, 2023, 2024
-
National Award Scholarship, 2023
-
Dean Scholarship of University of Chinese Academy of Sciences, 2022, 2023
-
Merit Student at University of Chinese Academy of Sciences, 2021, 2022, 2023
-
1st Place in Forecasting Challenge (ECCV 2022 Ego4D Workshop), 2022
-
2nd Place in Action Recognition in the Dark Challenge (CVPR 2022 UG2+ Workshop), 2022
-
1st Place in Semantic Segmentation of Remote Sensing Images (CCF BDCI Contest), 2021
-
Comprehensive Grand Prize of CCF BDCI Contest, 2021
-
Excellent Higher Education Graduate of Beijing Municipality, 2020
-
Grand Prize of Social Work Scholarship, Grand Prize of Study Excellence Scholarship, Merit Student, and Honor Student at Beihang University, 2017, 2018, 2019
Services
- Conference Reviewer: ICLR 2023/2024/2025, CVPR 2023/2024/2025, ICCV 2023/2025, NeurIPS 2023/2024, ICML 2024, ECCV 2024
- Journal Reviewer: TPAMI, IJCV, PR, NN, JVCI
- Talks: AI Drive 2022, AI Time 2022, AI Time 2023, and others (FDU, UCAS, KAUST...)