Zixian Ma
Ph.D. Student @ University of Washington
zixianma [at] cs.washington.edu
Hi! I am a PhD student (2023 - ?) at the University of Washington in Computer Science and Engineering, advised by Prof. Ranjay Krishna and Prof. Dan Weld. Between Aug 2022 and May 2023, I interned at Google Research and worked at Meta. Before that, I graduated from Stanford University with B.S. (Honors) and M.S. degrees in Computer Science, where I did research in the Stanford Vision and Learning Lab and HCI group under the supervision of the wonderful Prof. Fei-Fei Li and Prof. Michael Bernstein.
My current research interests lie broadly in ML, CV and HCI, especially in multi-modal agents, vision-language models, and human-AI interaction and collaboration. My research is supported by the Qualcomm Innovation Fellowship.
Selected publications
Multi-modal Models
Molmo2: Open Weights and Data for Vision-Language Models with Video Understanding and Grounding
Christopher Clark*♥, Jieyu Zhang*♥, Zixian Ma*♥, Jae Sung Park♥, Mohammadreza Salehi♥, Rohun Tripathi♥, Sangho Lee♥, Jason Ren, Chris Dongjoo Kim, Yinuo Yang, Vincent Shao, Yue Yang, Weikai Huang, Ziqi Gao, Taira Anderson, Jianrui Zhang, Jitesh Jain, George Stoica, Winston Han, Ali Farhadi, Ranjay Krishna.
[Blog]
[Report]
[Models]
[Data]
[Playground]
LATTE: Learning to Think with Vision Specialists
Zixian Ma, Jianguo Zhang, Zhiwei Liu, Jieyu Zhang, Juntao Tan, Manli Shu, Juan Carlos Niebles, Shelby Heinecke, Huan Wang, Caiming Xiong, Ranjay Krishna, Silvio Savarese
EMNLP 2025 Oral Presentation | SynData4CV workshop @ CVPR 2025
[Website]
[PDF]
[Code]
Synthetic Visual Genome
Jae Sung Park, Zixian Ma, Linjie Li, Chenhao Zheng, Cheng-Yu Hsieh, Ximing Lu, Khyathi Chandu, Quan Kong, Norimasa Kobori, Ali Farhadi, Yejin Choi, Ranjay Krishna
CVPR 2025
[Website]
[PDF]
[Code]
Task Me Anything
Jieyu Zhang, Weikai Huang*, Zixian Ma*, Oscar Michel, Dong He, Tanmay Gupta, Wei-Chiu Ma, Ali Farhadi, Aniruddha Kembhavi, Ranjay Krishna.
NeurIPS 2024 (Datasets & Benchmarks Track) | Video-Language Models @ NeurIPS 2024 Oral Presentation
[Website]
[PDF]
[Code]
m&m's: A Benchmark to Evaluate Tool-Use for multi-step multi-modal Tasks
Zixian Ma, Weikai Huang, Jieyu Zhang, Tanmay Gupta, Ranjay Krishna
ECCV 2024 | SynData4CV workshop @ CVPR 2024
[Website]
[PDF]
[Code]
SugarCrepe: Fixing Hackable Benchmarks for Vision-Language Compositionality
Cheng-Yu Hsieh*, Jieyu Zhang*, Zixian Ma, Aniruddha Kembhavi, Ranjay Krishna
NeurIPS 2023 (Datasets & Benchmarks Track)
[PDF]
[Code]
CREPE: Can Vision-Language Foundation Models Reason Compositionally?
Zixian Ma*, Jerry Hong*, Mustafa Omer Gul*, Mona Gandhi, Irena Gao, Ranjay Krishna
CVPR 2023 Highlight [top 2.5%]
[PDF]
[Code]
Human-AI Interaction
Rethinking Human Preference Evaluation of LLM Rationales
Ziang Li*, Manasi Ganti*, Zixian Ma*, Helena Vasconcelos, Qijia He, Ranjay Krishna
XLLM-Reason-Plan workshop @ COLM 2025 Best Paper Award (Honorable Mention)
[PDF]
Completion ≠ Collaboration: Scaling Collaborative Effort with Agents
Shannon Zejiang Shen, Valerie Chen, Ken Gu, Alexis Ross, Zixian Ma, Alex Gu, Chenglei Si, Jillian Ross, Jocelyn J Shen, Wayne Chi, Andi Peng, Ameet Talwalkar, Tongshuang Wu, David Sontag
ResponsibleFM (Foundation Models) workshop @ NeurIPS 2025 Best Paper Award
[Website] [PDF] [Code]
Model Sketching: Centering Concepts in Early-Stage Machine Learning Model Design
Michelle S. Lam, Zixian Ma, Anne Li, Izequiel Freitas, Dakuo Wang, James A. Landay, Michael S. Bernstein
CHI 2023
[PDF]
ELIGN: Expectation Alignment as a Multi-agent Intrinsic Reward
Zixian Ma, Rose Wang, Li Fei-Fei, Michael Bernstein, Ranjay Krishna
NeurIPS 2022
[PDF] [Code]