Wei Chen
PhD Student
Department of Computer Science and Engineering
The Hong Kong University of Science and Technology
Email: csewei.chen AT connect DOT ust DOT hk
Biography
I am a PhD student in the Department of Computer Science and Engineering at The Hong Kong University of Science and Technology, where I am advised by Prof. Long Chen in the LONG Group.
Prior to this, I obtained my bachelor's degree from the School of Computer Science at Wuhan University, under the supervision of Prof. Yu Wu.
Research Interests
My research interests lie in multi-modal content understanding and generation, including:
- Vision-language alignment in Multi-modal Large Language Models (MLLM)
- Reinforcement Learning from Human Feedback (RLHF) in MLLM
- Unified multi-modal understanding and generation with mutual enhancement
🔍 I am currently seeking Summer 2026 internships related to Multi-modal Large Language Models (MLLM). Please feel free to contact me if you have any opportunities!
Publications
Relation-R1: Cognitive Chain-of-Thought Guided Reinforcement Learning for Unified Relational Comprehension
Lin Li*, Wei Chen*, Jiahui Li, Kwang-Ting Cheng, Long Chen
AAAI Conference on Artificial Intelligence (AAAI), 2026
[paper]
[code]
Thyme: Think Beyond Images
Yi-Fan Zhang, Xingyu Lu, Shukang Yin, Chaoyou Fu, Wei Chen, Xiao Hu, Bin Wen, et al.
Decoupling Contrastive Decoding: Robust Hallucination Mitigation in Multimodal Large Language Models
Wei Chen, Xin Yan, Bin Wen, Fan Yang, Tingting Gao, Di Zhang, Long Chen
Neural Information Processing Systems (NeurIPS), 2025
[paper]
[code]
CoMM: A Coherent Interleaved Image-Text Dataset for Multimodal Understanding and Generation
Wei Chen*, Lin Li*, Yongqi Yang*, Bin Wen, Fan Yang, Tingting Gao, Yu Wu, Long Chen
Computer Vision and Pattern Recognition (CVPR), 2025 (Highlight)
[paper]
[code]
An Efficient and Effective Transformer Decoder-Based Framework for Multi-Task Visual Grounding
Wei Chen, Long Chen, Yu Wu
European Conference on Computer Vision (ECCV), 2024
[paper]
[code]
Internship Experience
Kuaishou Keye-VL Team, Beijing, China (Mar. 2024 – Present)
- Research Intern, Remote Collaboration: Sep. 2024 – Apr. 2025
- Research Intern, Multimedia Understanding Group: Mar. 2024 – Aug. 2024