Hao Chen
218 Mengminwei Building
Zijingang Campus, Zhejiang University
Hangzhou, China
Hao Chen is a Research Professor in the College of Computer Science and the State Key Laboratory of CAD&CG at Zhejiang University. Before joining ZJU in 2022, he was a senior researcher at Huawei Noah’s Ark Lab (Genius Youth Program). He received his Ph.D. degree from the University of Adelaide, where he was a member of the Adelaide Intelligent Machines Group, advised by Professor Chunhua Shen. Before coming to Adelaide, he obtained his undergraduate and master's degrees from Zhejiang University and worked as a researcher at NetEase Inc.
His general research interest lies in foundation models of Computer Vision. His recent research involves:
Internship applications are welcome! If you want to apply for a postgraduate degree, please contact me in advance!
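Postdoc, research assistant, and internship or exchange positions are available! If you want to apply for a master's or Ph.D. program, please join the group as an intern beforehand; interns with research output will be given priority for admission!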
Oct 10, 2025 Two papers accepted by NeurIPS 2025!
Jul 24, 2025 Some recent cool demos released: 3D editing with Tinker, our new mobile manipulation system Odyssey.
Jul 24, 2025 Three papers accepted by ICCV 2025!
Mar 05, 2025 Six papers accepted by ICLR 2025 (two Spotlights)!
Diception: A generalist diffusion model for visual perceptual tasks
Canyu Zhao, Mingyu Liu, Huanyi Zheng, Muzhi Zhu, Zhiyue Zhao, and 3 more authors
Advances in Neural Information Processing Systems, 2025
Omni-R1: Reinforcement Learning for Omnimodal Reasoning via Two-System Collaboration
Hao Zhong, Muzhi Zhu, Zongze Du, Zheng Huang, Canyu Zhao, and 4 more authors
Advances in Neural Information Processing Systems, 2025
POMATO: Marrying Pointmap Matching with Temporal Motion for Dynamic 3D Reconstruction
Songyan Zhang, Yongtao Ge, Jinyuan Tian, Guangkai Xu, Hao Chen, and 2 more authors
In Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025
SegAgent: Exploring Pixel Understanding Capabilities in MLLMs by Imitating Human Annotator Trajectories
Muzhi Zhu, Yuzhuo Tian, Hao Chen, Chunluan Zhou, Qingpei Guo, and 3 more authors
In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025
Framer: Interactive Frame Interpolation
Wen Wang, Qiuyu Wang, Kecheng Zheng, Hao Ouyang, Zhekai Chen, and 4 more authors
In The Thirteenth International Conference on Learning Representations, 2025
Generative Active Learning for Long-tailed Instance Segmentation
Muzhi Zhu, Chengxiang Fan, Hao Chen, Yang Liu, Weian Mao, and 2 more authors
In Forty-First International Conference on Machine Learning, 2024
Recently, large-scale language-image generative models have gained widespread attention and many works have utilized generated data from these models to further enhance the performance of perception tasks. However, not all generated data can positively impact downstream models, and these methods do not thoroughly explore how to better select and utilize generated data. On the other hand, there is still a lack of research oriented towards active learning on generated data. In this paper, we explore how to perform active learning specifically for generated data in the long-tailed instance segmentation task. Subsequently, we propose BSGAL, a new algorithm that estimates the contribution of the current batch-generated data based on gradient cache. BSGAL is meticulously designed to cater for unlimited generated data and complex downstream segmentation tasks. BSGAL outperforms the baseline approach and effectually improves the performance of long-tailed segmentation.
Matcher: Segment Anything with One Shot Using All-Purpose Feature Matching
Yang Liu, Muzhi Zhu, Hengtao Li, Hao Chen, Xinlong Wang, and 1 more author
In The Twelfth International Conference on Learning Representations, 2024
FrozenRecon: Pose-free 3D Scene Reconstruction with Frozen Depth Models
Guangkai Xu, Wei Yin, Hao Chen, Chunhua Shen, Kai Cheng, and 1 more author
In Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
What Matters When Repurposing Diffusion Models for General Dense Perception Tasks?
Guangkai Xu, Yongtao Ge, Mingyu Liu, Chengxiang Fan, Kangyang Xie, and 3 more authors
In The Thirteenth International Conference on Learning Representations, 2025
Metric3D: Towards Zero-shot Metric 3D Prediction from A Single Image
Wei Yin, Chi Zhang, Hao Chen, Zhipeng Cai, Gang Yu, and 3 more authors
In Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
Champion of the CVPR 2023 Monocular Depth Estimation Challenge
BlendMask: Top-Down Meets Bottom-Up for Instance Segmentation
Hao Chen, Kunyang Sun, Zhi Tian, Chunhua Shen, Yongming Huang, and 1 more author
In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020
ABCNet: Real-time Scene Text Spotting with Adaptive Bezier-Curve Network
Yuliang Liu, Hao Chen, Chunhua Shen, Tong He, Lianwen Jin, and 1 more author
In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020
FCOS: Fully Convolutional One-Stage Object Detection
Zhi Tian, Chunhua Shen, Hao Chen, and Tong He
In Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019
Fast Neural Architecture Search of Compact Semantic Segmentation Models via Auxiliary Cells
Vladimir Nekrasov, Hao Chen, Chunhua Shen, and Ian Reid
In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019
Feel free to check out some of my old posts and videos.