I'm broadly interested in video and 3D vision, with the goal of enabling agents to perceive the physical world and reason over what they perceive. I'm also excited to explore other interesting topics (such as generative models).
3D visual grounding is a challenging task that often requires direct and dense supervision, notably semantic labels for every object in the scene.
In this paper, we instead study the naturally supervised setting, which learns only from 3D scene and QA pairs and in which prior works underperform.
We propose the Language-Regularized Concept Learner (LARC), which uses constraints from language as regularization to significantly improve the accuracy of neuro-symbolic concept learners in the naturally supervised setting.
We propose TranSTR, which features a spatio-temporal rationalization (STR) module together with a more reasonable candidate-answer modeling strategy. The answer modeling strategy is independently verified to be effective in boosting other existing VideoQA models.
We propose RaFormer, a fully transformer-based VideoQA model that reduces neighboring-frame redundancy by highlighting object-level changes across adjacent frames and passing messages beyond the local neighborhood at the frame level. It also handles cross-modal redundancy via a novel adaptive sampling module.