I’m interested in how we can build reliable multimodal AI systems that discover and compose the right first principles.
SELECTED RESEARCH
🗞️News: Get started with computer-use agents using OpenApps, our new Python-based research environment. Check out our 📣 release, 📃 paper, and 🎬 video tutorial.
🗞️News: I'll be at NeurIPS 2025 sharing some of our latest work: Common-O (a new multimodal reasoning benchmark), AbstentionBench (measuring LLMs' ability to recognize when they don't know), Verbatim Memorization in LLMs, and a study of JEPA architectures (spotlight award). Reach out if you'll be in San Diego!
AbstentionBench: Reasoning LLMs Fail on Unanswerable Questions
We evaluate LLMs’ capacity for abstention: the skill of knowing when NOT to answer! We find reasoning LLMs struggle with unanswerable questions and hallucinate. NeurIPS, 2025
> paper + code + data
Cited in OpenAI's GPT-5 technical report and used in the UK AI Security Institute's Inspect Evals
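A minimal sketch of the idea behind scoring abstention (illustrative only, not the AbstentionBench API): given a model's answer to an unanswerable question, check whether it declined to answer rather than guessing. The marker list and scoring rule are assumptions for illustration.

```python
# Toy abstention scorer: did the model decline to answer, or hallucinate?
# Marker phrases below are illustrative assumptions, not AbstentionBench's.
ABSTENTION_MARKERS = [
    "i don't know",
    "i do not know",
    "cannot be determined",
    "not enough information",
    "unanswerable",
]

def abstained(answer: str) -> bool:
    """Return True if the answer signals abstention rather than a guess."""
    answer = answer.lower()
    return any(marker in answer for marker in ABSTENTION_MARKERS)

def abstention_rate(answers: list[str]) -> float:
    """Fraction of answers to unanswerable questions where the model abstained."""
    return sum(abstained(a) for a in answers) / len(answers)

# Example: a model that guesses on one of two unanswerable questions scores 0.5.
print(abstention_rate([
    "I don't know; the question doesn't specify which year.",  # abstains
    "The answer is 42.",                                        # hallucinates
]))
```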
The Factorization Curse: Which Tokens You Predict Underlie the Reversal Curse and More
We show training transformers to predict multiple tokens ahead and back (instead of just the single next token) improves models' ability to retrieve knowledge. NeurIPS, 2024
> paper
In a follow-up, we find the same objective, without any tweaks, also improves transformers' ability to plan in maze navigation!
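A minimal sketch of the objective described above (assumed tensor shapes and head names, not the paper's training code): alongside the standard next-token loss, add a previous-token prediction term so the model learns to predict tokens both ahead and back.

```python
# Sketch: next-token loss plus a previous-token loss on dummy data.
import torch
import torch.nn.functional as F

batch, seq_len, vocab = 2, 8, 100
tokens = torch.randint(vocab, (batch, seq_len))

# Stand-ins for two prediction heads sharing a transformer trunk.
forward_logits = torch.randn(batch, seq_len, vocab)   # predicts token t+1 at position t
backward_logits = torch.randn(batch, seq_len, vocab)  # predicts token t-1 at position t

# Standard next-token loss: positions 0..T-2 predict tokens 1..T-1.
next_token_loss = F.cross_entropy(
    forward_logits[:, :-1].reshape(-1, vocab), tokens[:, 1:].reshape(-1)
)

# Previous-token loss: positions 1..T-1 predict tokens 0..T-2.
prev_token_loss = F.cross_entropy(
    backward_logits[:, 1:].reshape(-1, vocab), tokens[:, :-1].reshape(-1)
)

loss = next_token_loss + prev_token_loss  # equal weighting is an illustrative choice
print(loss.item())
```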
\(\mathbb{X}\)-Sample Contrastive Loss: Improving Contrastive Learning with Sample Similarity Graphs
We propose a graph-based contrastive loss that explicitly encodes relationships across samples during training. ICLR, 2025
> paper
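A minimal sketch of the loss described above (illustrative, not the released implementation): the targets of a CLIP-style contrastive objective come from a row-normalized sample-similarity graph instead of one-hot positives, so each sample's target mass is spread over related samples.

```python
# Sketch: contrastive loss with soft, graph-derived targets on dummy data.
import torch
import torch.nn.functional as F

batch, dim = 4, 16
z_a = F.normalize(torch.randn(batch, dim), dim=-1)  # embeddings, view/modality A
z_b = F.normalize(torch.randn(batch, dim), dim=-1)  # embeddings, view/modality B

# Assumed input: a graph of cross-sample relationships (e.g. from shared labels
# or caption similarity), row-normalized. The identity matrix recovers the
# standard InfoNCE/CLIP-style objective.
similarity_graph = torch.eye(batch)
similarity_graph = similarity_graph / similarity_graph.sum(dim=-1, keepdim=True)

temperature = 0.1
logits = z_a @ z_b.T / temperature

# Cross-entropy against soft targets rather than hard positive indices.
loss = -(similarity_graph * F.log_softmax(logits, dim=-1)).sum(dim=-1).mean()
print(loss.item())
```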
Does Progress On Object Recognition Benchmarks Improve Real-World Generalization?
We find progress on standard benchmarks fails to reduce geographic disparities in today's best models. ICLR, 2024
> paper
Shortcuts Come in Multiples Where Mitigating One Amplifies Others
We study how deep learning models cope with multiple shortcuts and propose a method for mitigating them. CVPR, 2023
> paper + code
ImageNet-X: Understanding Model Mistakes with Factor of Variation Annotations
We find surprisingly similar strengths and vulnerabilities across more than 2,200 deep learning models. ICLR (Spotlight), 2023
> paper + website
Global Explanations for Neural Networks: Mapping the Landscape of Predictions
AAAI/ACM Conference on AI, Ethics, and Society (AIES), 2019
> paper + open source library + blog post
A Cookbook of Self-Supervised Learning
Co-authored with Randall Balestriero, Yann LeCun, and many others
Talks
Self-Supervised Learning: The Final Frontier of AI at the Simons Foundation's Flatiron Institute (April 2025)
Lightning talk on latent space prediction, organized by Randall Balestriero, Yann LeCun, Alberto Bietti, and Shirley Ho
NeurIPS Self-Supervised Learning Workshop Oral (Dec 2024)
Occam's Razor: What's sufficient for learning good self-supervised representations?
Brown University talk on Robust Representation Learning (2024)
From Vision to Multimodal Self-Supervised Models
ICML Tutorial on Self-Supervised Learning (2023) (400+ researchers attended)
From Research Advances to Best Practices (slides and recording)
Georgia Tech's Deep Learning Course Instructor (2022) (10k+ online students)
Lecture on "Feed Forward Neural Networks"
PyCon US 2020 (Python Conference)
Talk on "Machine Learning on Encrypted Data with CrypTen"
NeurIPS 2018 FEAP Workshop Spotlight Talk (Dec 2018)
"Towards Explainable Deep Learning for Credit Lending"
New York Python Meetup (Dec 2018)
Data Science Talk: "Explaining Deep Learning Models"
Applied Machine Learning Tom Tom Conference (April 2018)
"Explainable AI: Key Techniques and Societal Implications"
George Washington University, Data Driven Conference (Dec 2017)
"Understanding the Predictions of Deep Neural Networks"
NYC Data Wranglers Meetup (Aug 2016)
Data Science in Practice: "Building a Graph-Based Search Engine"
Advising & Mentorship
Research Internship Advising:
Ouail Kitouni (MIT PhD, now Research Scientist at Anthropic), Mazda Moayeri (University of Maryland PhD student), Cian Eastwood (now Senior Research Scientist at Valence Labs), Karsten Roth (PhD student, Google DeepMind/Tübingen).
AI Residents:
Megan Richards (incoming NYU PhD student with Prof. Kyunghyun Cho), Haider Al-Tahan (incoming Georgia Tech PhD student)
Industry PhD Advisor:
Polina Kirichenko (NYU PhD, now Research Scientist at FAIR, Meta AI).
Courses Taught at the University of Vermont
Calculus I (71 students), Calculus II (38 students), and College Algebra (42 students).