Ayush Jain
I am a PhD student in the Robotics Institute at Carnegie Mellon University, working with Prof. Katerina Fragkiadaki, and a Visiting Researcher at Meta with Dr. Roozbeh Mottaghi. Previously, I received an M.S. in Robotics from Carnegie Mellon University, advised by Prof. Katerina Fragkiadaki, and a B.E. in Computer Science from BITS Pilani, Pilani Campus, during which I spent one year working with Prof. Katerina Fragkiadaki as a research assistant in the Machine Learning Department at Carnegie Mellon University. I am supported in part by the CMU Robotics Vision Fellowship and the Meta AIM Fellowship.
I am interested in Computer Vision and its application to robot learning. Lately, I have been interested in the relationship between visual modalities: how are the 2D, 3D, and video modalities related? How can one benefit the other? In my free time, I enjoy playing racquet sports like tennis, squash, and badminton, listening to audiobooks and podcasts, watching and playing cricket, learning music, and spending quiet time at home in a peaceful setting.
Virtual MLC office hours (open to everyone): I set aside two hours every week to help newcomers in the fields of robotics, computer vision, and machine learning. If you're new to these areas and need help with research, grad school applications, or anything else, please feel free to book a slot. Here is a living Google Doc of resources related to commonly asked questions in these office hours. Additionally, if you need some (small) financial assistance, please feel free to reach out with details.
Interests
- Computer Vision
- Robotics
Education
- PhD in Robotics, 2023-Present, Carnegie Mellon University
- Masters in Robotics, 2021-23 (Thesis PDF), Carnegie Mellon University
- B.E. in Computer Science, 2017-21, BITS Pilani, Pilani Campus
News
[Dec 2025] I’ll serve as an Area Chair for ECCV 2026.
[Oct 2025] Received an outstanding reviewer award at ICCV 2025.
[May 2025] I’ll be spending my summer as a research intern at Meta with Fan Zhang and Adam Harley.
[April 2025] Awarded the Meta AI Mentorship Fellowship (AY 25-26). I'll be a visiting researcher at Meta AI working with Roozbeh Mottaghi.
[Nov 2024] Awarded the CMU Robotics Vision Fellowship (AY 24-25).
[June 2024] I’ll be presenting ODIN as a highlight paper at CVPR 2024. I’ll also be giving an oral talk at the Causal and Object-Centric Representations for Robotics workshop at CVPR 2024.
[May 2024] I’ll be spending my summer as a research intern at Meta with Franziska Meier and Sasha Sax.
[Apr 2024] Received an outstanding reviewer award at CVPR 2024.
[Sep 2023] Received an outstanding reviewer award at ICCV 2023.
[Apr 2023] I’ll be continuing in the Robotics Institute at CMU for my PhD.
[Oct 2022] I’ll be presenting BUTD-DETR at ECCV 2022.
[May 2022] I’ll be spending my summer as a research intern at Apple with Miguel and Navdeep.
[Dec 2021] Our team was selected for the Alexa Prize SimBot Challenge. We will be working on advancing teachable embodied household agents.
[Aug 2021] I’ll be starting my Masters in Robotics at Carnegie Mellon University. I will also continue as a Research Assistant with Prof. Katerina Fragkiadaki.
Recent Publications
Grounded Reinforcement Learning for Visual Reasoning
Unifying 2D and 3D Vision-Language Understanding
Locate 3D: Real-World Object Localization via Self-Supervised Learning in 3D
LIFT-GS: Cross-Scene Render-Supervised Distillation for 3D Language Grounding
ODIN: A Single Model for 2D and 3D Segmentation
Diffusion-ES: Gradient-free Planning with Diffusion for Autonomous Driving and Zero-Shot Instruction Following
Energy-based Models as Zero-Shot Planners for Compositional Scene Rearrangement
Bottom Up Top Down Detection Transformers for Language Grounding in Images and Point Clouds
Move to See Better: Towards Self-Improving Embodied Object Detection
AI-enabled Object Detection in Unmanned Aerial Vehicles for Edge Computing Applications
VisDrone-DET2020: The Vision Meets Drone Object Detection in Image Challenge Results
NukeBERT: A Pre-trained language model for Low Resource Nuclear Domain
Experience
Graduate Research Assistant
Carnegie Mellon University
Research Associate
Carnegie Mellon University
Research Assistant
BITS Pilani
Academic Service and Teaching
I have reviewed for CVPR 2024 (outstanding reviewer award), ICCV 2023 (outstanding reviewer award), NeurIPS 2023-24, AAAI 2023, CVPR 2022-23, ECCV 2022-24, BMVC 2020-21, and TPAMI 2021-22.
I have served as a teaching assistant for the following courses:
- Advanced Computer Vision, Fall 2024, CMU
- Learning for 3D Vision, Spring 2024, CMU
- CS F407 Artificial Intelligence, Fall 2020, BITS Pilani
- CS F464 Machine Learning, Spring 2020, BITS Pilani
- CS F111 Computer Programming, Spring 2019 and Fall 2019, BITS Pilani