© Kunchang Li | Last updated: 05/26/2025
Kunchang Li
Researcher, Seed, Bytedance
Email: likunchang[at]bytedance.com
Biography [CV]
I am a Researcher at Seed, Bytedance. I obtained my Ph.D. degree from the Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences (SIAT) in 2025, under the supervision of Yu Qiao and Yali Wang. I received my B.S. degree from the Software School, Beihang University in 2020. Previously, I was a Research Intern at Megvii, SenseTime, Shanghai AI Laboratory, and Bytedance.
Most of my previous research concerns video foundation models, including model design, large-scale pretraining, dataset collection, and benchmark evaluation. Currently, I am dedicated to researching unified multimodal models, striving to contribute to the development of more powerful and intriguing AGI.
Research
* denotes co-first authors. Representative papers are highlighted. All my works are open-sourced on GitHub, and the full paper list can be found on Google Scholar.
-
InternVideo2: Scaling Foundation Models for Multimodal Video Understanding
Yi Wang*, Kunchang Li*, Xinhao Li*, Jiashuo Yu*, Yinan He*, Guo Chen, Baoqi Pei, Rongkun Zheng, Jilan Xu, Zun Wang, Yansong Shi, Tianxiang Jiang, Songze Li, Hongjie Zhang, Yifei Huang, Yu Qiao, Yali Wang, Limin Wang.
European Conference on Computer Vision (ECCV), 2024.
-
VideoMamba: State Space Model for Efficient Video Understanding
Kunchang Li, Xinhao Li, Yi Wang, Yinan He, Yali Wang, Limin Wang, Yu Qiao.
European Conference on Computer Vision (ECCV), 2024.
-
MVBench: A Comprehensive Multi-modal Video Understanding Benchmark
Kunchang Li, Yali Wang, Yinan He, Yizhuo Li, Yi Wang, Yi Liu, Zun Wang, Jilan Xu, Guo Chen, Ping Luo, Limin Wang, Yu Qiao.
Computer Vision and Pattern Recognition (CVPR), 2024. (Highlight, Top 3%) [Paper] [Code] [Demo] [Dataset] [Benchmark] [Leaderboard] [Video] [Blog]
-
Vlogger: Make Your Dream A Vlog
Shaobin Zhuang, Kunchang Li, Xinyuan Chen, Yaohui Wang, Ziwei Liu, Yu Qiao, Yali Wang.
Computer Vision and Pattern Recognition (CVPR), 2024.
-
InternVid: A Large-scale Video-Text Dataset for Multimodal Understanding and Generation.
Yi Wang*, Yinan He*, Yizhuo Li*, Kunchang Li, Jiashuo Yu, Xin Ma, Xinhao Li, Guo Chen, Xinyuan Chen, Yaohui Wang, Conghui He, Ping Luo, Ziwei Liu, Yali Wang, Limin Wang, Yu Qiao.
International Conference on Learning Representations (ICLR), 2024. (Spotlight, Top 6%)
-
UniFormer: Unifying Convolution and Self-attention for Visual Recognition
Kunchang Li*, Yali Wang*, Junhao Zhang, Peng Gao, Guanglu Song, Yu Liu, Hongsheng Li, Yu Qiao.
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023.
-
VideoChat: Chat-Centric Video Understanding
Kunchang Li*, Yinan He*, Yi Wang*, Yizhuo Li, Wenhai Wang, Ping Luo, Yali Wang, Limin Wang, Yu Qiao.
SCIENCE CHINA Information Sciences (SCIS), 2025.
-
Unmasked Teacher: Towards Training-Efficient Video Foundation Models
Kunchang Li, Yali Wang, Yizhuo Li, Yi Wang, Yinan He, Limin Wang, Yu Qiao.
International Conference on Computer Vision (ICCV), 2023. (Oral, Top 2%)
-
UniFormerV2: Unlocking the Potential of Image ViTs for Video Understanding
Kunchang Li, Yali Wang, Yinan He, Yizhuo Li, Yi Wang, Limin Wang, Yu Qiao.
International Conference on Computer Vision (ICCV), 2023.
-
InternVideo: General Video Foundation Models via Generative and Discriminative Learning.
Yi Wang*, Kunchang Li*, Yizhuo Li*, Yinan He*, Bingkun Huang*, Zhiyu Zhao*, Hongjie Zhang*, Jilan Xu, Yi Liu, Zun Wang, Sen Xing, Guo Chen, Junting Pan, Jiashuo Yu, Yali Wang, Limin Wang, Yu Qiao.
arXiv, 2022.
-
You Only Need 90K Parameters to Adapt Light
Ziteng Cui, Kunchang Li, Lin Gu, Shenghan Su, Peng Gao, Zhengkai Jiang, Yu Qiao, Tatsuya Harada.
The British Machine Vision Conference (BMVC), 2022.
-
PointCLIP: Point Cloud Understanding by CLIP
Renrui Zhang, Ziyu Guo, Wei Zhang, Kunchang Li, Xupeng Miao, Bin Cui, Yu Qiao, Peng Gao, Hongsheng Li.
Computer Vision and Pattern Recognition (CVPR), 2022.
-
MorphMLP: An Efficient MLP-Like Backbone for Spatial-Temporal Representation Learning
David Junhao Zhang*, Kunchang Li*, Yali Wang, Yunpeng Chen, Shashwat Chandra, Yu Qiao, Luoqi Liu, Mike Zheng Shou.
European Conference on Computer Vision (ECCV), 2022.
-
Self-slimmed Vision Transformer
Zhuofan Zong*, Kunchang Li*, Guanglu Song, Yali Wang, Yu Qiao, Biao Leng, Yu Liu.
European Conference on Computer Vision (ECCV), 2022.
-
Tip-Adapter: Training-free CLIP-Adapter for Better Vision-Language Modeling
Renrui Zhang*, Rongyao Fang*, Wei Zhang*, Peng Gao, Kunchang Li, Jifeng Dai, Yu Qiao, Hongsheng Li.
European Conference on Computer Vision (ECCV), 2022.
-
UniFormer: Unified Transformer for Efficient Spatiotemporal Representation Learning
Kunchang Li*, Yali Wang*, Junhao Zhang, Peng Gao, Guanglu Song, Yu Liu, Hongsheng Li, Yu Qiao.
International Conference on Learning Representations (ICLR), 2022. (Top 3% of review scores)
-
CT-Net: Channel Tensorization Network for Video Classification
Kunchang Li*, Xianhang Li*, Yali Wang*, Jun Wang, Yu Qiao.
International Conference on Learning Representations (ICLR), 2021.
Internships
-
Seed, Bytedance, Beijing, China (Oct. 2024 – Present)
Research Intern
-
OpenGVLab, Shanghai AI Lab, Shanghai, China (Nov. 2021 – Oct. 2024)
Research Intern
Advisors: Yali Wang, Limin Wang, Yi Wang
Topic: General Video Foundation Model; Large-scale Pre-training; Multimodal Learning
-
X-Lab, SenseTime, Beijing, China (Feb. 2021 – Nov. 2021)
Research Intern
Advisors: Guanglu Song, Yu Liu
Topic: Efficient Architecture Design; Video Understanding
-
Megvii, Beijing, China (Oct. 2019 – Jan. 2020)
Research Intern
Topic: Model Reproduction for MegEngine
Honors
-
Excellent Graduate of University of Chinese Academy of Sciences, 2025
-
Zhu Li Yuehua Scholarship, 2024
-
Outstanding Graduate Student of SIAT, UCAS, 2023, 2024
-
National Award Scholarship, 2023
-
Dean Scholarship of University of Chinese Academy of Sciences, 2022, 2023
-
Merit Student at University of Chinese Academy of Sciences, 2021, 2022, 2023
-
1st Place in Forecasting Challenge (ECCV 2022 Ego4D Workshop), 2022
-
2nd Place in Action Recognition in the Dark Challenge (CVPR 2022 UG2+ Workshop), 2022
-
1st Place in Semantic Segmentation of Remote Sensing Images (CCF BDCI Contest), 2021
-
Comprehensive Grand Prize of CCF BDCI Contest, 2021
-
Excellent Higher Education Graduate of Beijing Municipality, 2020
-
Grand Prize of Social Work Scholarship, Grand Prize of Study Excellence Scholarship, Merit Student, and Honor Student at Beihang University, 2017, 2018, 2019
Services
- Conference Reviewer: ICLR 2023/2024/2025, CVPR 2023/2024/2025, ICCV 2023/2025, NeurIPS 2023/2024, ICML 2024, ECCV 2024
- Journal Reviewer: TPAMI, IJCV, PR, NN, JVCI
- Talks: AI Drive 2022, AI Time 2022, AI Time 2023, and others (FDU, UCAS, KAUST...)