Portrait of Aadarsh Sahoo

About

I am a Ph.D. student at Caltech advised by Prof. Georgia Gkioxari and Prof. Pietro Perona.

Previously, I was a student researcher at the MIT–IBM Watson AI Lab, pretraining large language models and doing research on vision–language models with Dr. Rameswar Panda, Dr. Rogerio Feris, and Prof. Yoon Kim. I completed a dual degree (B.Tech + M.Tech) at IIT Kharagpur, where I worked in the Computer Vision and Intelligence Research Lab under Prof. Abir Das.

In Summer 2021, I worked with Prof. Kate Saenko (Boston University) and Prof. Trevor Darrell (UC Berkeley) as a research intern for the DARPA LwLL project.

Research

My research focuses on building multimodal foundation models that perceive, reason, and interact in 3D environments. I am particularly interested in unified representations that bridge geometry, vision, and language, and in language-guided systems that can understand spatial relationships, engage in grounded conversations, and perform complex reasoning tasks. My work spans 3D tokenization for sequential modeling, conversational visual understanding, and parameter-efficient adaptation of large models. Ultimately, I aim to advance embodied AI systems that can robustly interpret and act within real-world environments to benefit society.

News

  • Preprint “Aligning Text, Images, and 3D Structure Token-by-Token (Kyvo)” released on arXiv (project page).
  • “XPL: A Cross-Model framework for Semi-Supervised Prompt Learning in VLMs” accepted by TMLR.
  • Attended ICVSS 2024 in Sicily.
  • Recognized as an Outstanding Reviewer for ECCV 2024.
  • Named an Outstanding Reviewer for CVPR 2024.
  • Started my Ph.D. at Caltech.
  • Selected as a Kortschak Scholar at Caltech.
  • Paper on Anytime Domain Adaptation accepted at ICLR 2023.
  • Select, Label, and Mix (SLM) received the Best Paper Honorable Mention at WACV 2023.
  • Paper on Partial Domain Adaptation accepted at WACV 2023.
  • Started as a student researcher at the MIT–IBM Watson AI Lab in Cambridge, MA.
  • Extended abstract on Partial Domain Adaptation accepted at the NeurIPS DistShift Workshop 2021.
  • Paper on Domain Adaptation in Action Recognition accepted at NeurIPS 2021.
  • Volunteered at the CVPR 2021 workshop Dynamic Neural Networks Meets Computer Vision (DNetCV).
  • Started a research internship with UC Berkeley and Boston University.
  • Paper accepted at the ECCV 2020 Workshop on Imbalance Problems in Computer Vision (IPCV).
  • Joined CVIR, IIT Kharagpur as an undergraduate researcher in Computer Vision.
  • Selected for Computer Science and Engineering major (<1% acceptance).

Publications

AnyDA: Anytime Domain Adaptation

Omprakash Chakraborty, Aadarsh Sahoo, Rameswar Panda, Abir Das — ICLR, 2023

We propose a domain-alignment approach with switchable depth, width, and input resolution to realize accuracy–efficiency trade-offs under different constraints.

Services

  • Reviewer: CVPR (Outstanding Reviewer 2024), ECCV (Outstanding Reviewer 2024), NeurIPS, ICLR, ICCV, ICML, WACV, TPAMI.
  • Student Volunteer & Reviewer: DNetCV @ CVPR 2022.
  • Student Volunteer & Reviewer: DNetCV @ CVPR 2021.

Contact

Email: aadarsh.sahoo.99@gmail.com

Profiles: Google Scholar · LinkedIn · GitHub · X/Twitter