Haoxuan You @ Columbia University
I received my PhD degree in Computer Science from Columbia University, advised by Prof. Shih-Fu Chang and
co-advised by Prof. Kai-Wei Chang from UCLA.
Previously, I received a Bachelor's degree from Xidian University in 2018. I then spent a gap year working as a Research
Assistant at Tsinghua University, advised by Prof. Yue Gao, during which I visited the MCL lab at the
University of Southern California, advised by Prof. C.-C. Jay Kuo.
📷 credit to: my wife Xiaohui
Haoxuan You (有昊轩)
I am a Research Scientist at Apple AI/ML Foundation Models. I work on fundamental problems in vision-and-language, with an emphasis on scalable, unified, and generalizable models and methods.
During my Ph.D. study, I was fortunate to intern at Microsoft Azure Cognitive Services Research (Mentor: Luowei Zhou), Google Research (Mentors: Jiahui Yu, Mandy Guo, and Jason Baldridge), and Apple AI/ML (Mentors: Liangliang Cao, Zhe Gan, and Yinfei Yang).
Selected Publications (Full List)
MM1.5: Methods, Analysis & Insights from Multimodal LLM Fine-tuning
A significant and comprehensive upgrade of MM1
arXiv 2024
Project Page / Paper
Ferret-v2: An Improved Baseline for Referring and Grounding with Large Language Models
Haotian Zhang*, Haoxuan You*, Philipp Dufter, Bowen Zhang, Chen Chen, Hong-You Chen, Tsu-Jui Fu, William Wang, Shih-Fu Chang, Yinfei Yang
COLM 2024
Project Page / Paper
Ferret: Refer and Ground Anything Anywhere at Any Granularity
Haoxuan You*, Haotian Zhang*, Zhe Gan, Xianzhi Du, Bowen Zhang, Zirui Wang, Liangliang Cao, Shih-Fu Chang, Yinfei Yang
ICLR 2024, Spotlight (5% Acceptance Rate)
Project Page / Paper / Code

CoBIT: A Contrastive Bi-directional Image-Text Generation Model
Haoxuan You, Mandy Guo, Zhecan Wang, Kai-Wei Chang, Jason Baldridge, Jiahui Yu
ICLR 2024
Project Page / Paper

IdealGPT: Iteratively Decomposing Vision and Language Reasoning via Large Language Models
Haoxuan You*, Rui Sun*, Zhecan Wang*, Long Chen, Gengyu Wang, Hammad Ayyubi, Kai-Wei Chang, Shih-Fu Chang
Empirical Methods in Natural Language Processing - Findings (EMNLP-Findings), 2023
Project Page / Paper / Code

Find Someone Who: Visual Commonsense Understanding in Human-Centric Grounding
Haoxuan You, Rui Sun*, Zhecan Wang*, Kai-Wei Chang, Shih-Fu Chang
Empirical Methods in Natural Language Processing - Findings (EMNLP-Findings), 2022
Project Page / Paper / Code

Learning Visual Representation from Modality-Shared Contrastive Language-Image Pre-training
Haoxuan You*, Luowei Zhou*, Bin Xiao*, Noel Codella*, Yu Cheng, Ruochen Xu, Shih-Fu Chang, Lu Yuan
Proc. of the European Conf. on Computer Vision (ECCV), 2022
Project Page / Paper / Code

SGEITL: Scene Graph Enhanced Image-Text Learning for Visual Commonsense Reasoning
Zhecan Wang*, Haoxuan You*, Liunian Li, Alireza Zareian, Suji Park, Yiqing Liang, Kai-Wei Chang, Shih-Fu Chang
Proc. of the AAAI Conference on Artificial Intelligence (AAAI), 2022
Project Page / Paper

Rethinking network design and local geometry in point cloud: A simple residual MLP framework
Xu Ma, Can Qin, Haoxuan You, Haoxi Ran, Yun Fu
The International Conference on Learning Representations (ICLR), 2022
Project Page / Paper / Code

Unsupervised vision-and-language pre-training without parallel images and captions
Liunian Li, Haoxuan You*, Zhecan Wang*, Alireza Zareian, Shih-Fu Chang, Kai-Wei Chang
Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2021
Project Page / Paper / Code

Learning Visual Commonsense for Robust Scene Graph Generation
Alireza Zareian*, Zhecan Wang*, Haoxuan You*, Shih-Fu Chang
Proc. of the European Conf. on Computer Vision (ECCV), 2020
Project Page / Paper / Code

PointHop: An Explainable Machine Learning Method for Point Cloud Classification
Min Zhang, Haoxuan You*, Pranav Kadam, Shan Liu, C-C Kuo (*Corresponding Author)
IEEE Transactions on Multimedia, 2020
Project Page / Paper / Code

PointDAN: A Multi-Scale 3D Domain Adaption Network for Point Cloud Representation
Can Qin*, Haoxuan You*, Lichen Wang, C-C Kuo, Yun Fu
Advances in Neural Information Processing Systems (NeurIPS), 2019
Project Page / Paper / Code

PVRNet: Point-View Relation Neural Network for 3D Shape Recognition
Haoxuan You, Yifan Feng, Xibin Zhao, Changqing Zou, Rongrong Ji, Yue Gao
Proc. of the AAAI Conference on Artificial Intelligence (AAAI), 2019
Project Page / Paper / Code

Hypergraph Neural Networks
Yifan Feng, Haoxuan You, Zizhao Zhang, Rongrong Ji, Yue Gao
Proc. of the AAAI Conference on Artificial Intelligence (AAAI), 2019
Project Page / Paper / Code

PVNet: A Joint Convolutional Network of Point Cloud and Multi-View for 3D Shape Recognition
Haoxuan You, Yifan Feng, Rongrong Ji, Yue Gao
Proc. of ACM international conference on Multimedia (ACM-MM), 2018
Project Page / Paper
Website template borrowed from Michael Niemeyer.