About me
I am an MS Robotics student at the School of Interactive Computing, Georgia Tech. I currently conduct research on Cross-Embodiment Learning for Robot Manipulation from Human-Play data at the Robot Learning and Reasoning (RL2) lab with Dr. Danfei Xu. Presently, I am extending our recent work, EgoMimic (CoRL 2024), to diverse multi-task settings and various axes of generalization (behavior, scene), exploring Vision-Language Models (VLMs) and large-scale embodied human datasets. My interests broadly lie at the intersection of Robotics, Computer Vision and Deep Learning.
UPDATE: I am actively seeking full-time opportunities in AI/Robotics starting May 2025 — feel free to reach out if there's a fit!
I spent summer 2024 interning at Honda Research Institute, USA, working on Scene Understanding for Autonomous Driving at Intersections. Before Georgia Tech, I explored open-source software development as a Google Summer of Code '23 contributor at Unify AI, with an aim to optimize SLAM for real-world deployment in Robotics applications. Prior to this, I was a Project Associate at the Robotics Research Centre (RRC), where I worked on Scene Understanding for Autonomous Driving in Adverse Weather Conditions (affiliated with Queensland University of Technology (QUT) and ZF Group), and spearheaded the IHub Project Mobility on UAV-based Visual Remote Sensing for Civil Infrastructure Safety Assessment.
I spent the summer of 2020 working on Simultaneous Localization and Mapping (SLAM) for Level-5 Autonomy at Swaayatt Robots. After this, I moved to a Software Engineering role at Amdocs and, alongside it, collaborated with the Norwegian Biometrics Laboratory (NTNU, Norway) on research into the Image Super-Resolution problem.
I am always open to collaborations, research discussions, or just an interesting chat on AI & Robotics. Feel free to connect with me on LinkedIn or via email!
Interests
- Robotics & Computer Vision
- Deep Learning
- AI & Neuroscience
Education
- MS in Robotics, August 2023 - May 2025
  Georgia Institute of Technology (Georgia Tech)
- B.Tech in Electronics & Communication Engg, July 2016 - July 2020
  Sardar Vallabhbhai National Institute of Technology, Surat
Recent
- May '25: Extended EgoMimic to study behavior and scene-level generalization from human data. Also part of an active collaborative effort to collect large-scale embodied human data for robot learning (more details soon!).
- Feb '25: Meta AI did a story covering our work, EgoMimic. Check it out here. EgoMimic has also been accepted at ICRA 2025.
- Jan '25: Starting as Graduate Teaching Assistant for CS 3630: Intro to Perception and Robotics.
- Nov '24: Presented EgoMimic at Conference on Robot Learning (CoRL) 2024 (Munich, Germany)
- August '24: Graduate Teaching Assistant for CS6476: Computer Vision
- May '24: Summer Intern at Honda Research Institute, USA.
- Jan '24: Graduate Teaching Assistant for CS4476: Intro to Computer Vision.
- Nov '23: Started working with Prof. Danfei Xu at the Robot Learning and Reasoning Lab (RL2), Georgia Tech.
- August '23: Started MS in Robotics at Georgia Tech! Graduate Teaching Assistant for CS6476: Computer Vision.
- May '23: Proposal accepted at Google Summer of Code (GSoC)!
- May '23: Presented GDIP at ICRA 2023.
- Nov '22: Presented SRTGAN at CVIP 2022.
- Oct '22: Presented UVRSABI as a spotlight paper at CVCIE workshop, ECCV 2022.
- Sept '22: UVRSABI released as open-source software at IHub Data Mobility Summit 2022; it will be used by CRRI (Govt. of India) for civil inspection deployments.
Work Experience
Graduate Student Researcher
Robot Learning and Reasoning Lab (RL2), Georgia Tech
Cross-Embodiment Learning for Robot Manipulation using Embodied Human-Play Data [WebPage]
Research Associate Intern
Honda Research Institute, USA
Open-Source Software Developer
Google Summer of Code 2023
- Google Summer of Code '23 Contributor at Ivy - unify.ai
- Developed multi-backend framework support (PyTorch, JAX, NumPy, TensorFlow) for the GradSLAM library in Ivy, with the aim of optimizing deployment through highly efficient frameworks like JAX.
Project Associate
Robotics Research Centre (RRC), IIIT Hyderabad
- Collaborated with ZF Friedrichshafen group and Queensland University of Technology (QUT) Robotics on improving perception and scene understanding for adverse weather conditions.
- Proposed GDIP (Gated Differentiable Image Processing), which establishes a new SOTA for object detection in foggy and low-light conditions.
- Researched downstream problems like video object detection/tracking and explored Probabilistic Graphical Models (PGMs) for weather-agnostic feature refinement.
- Automated assessment of civil structures with the help of visual remote sensing.
- Utilized Structure-from-Motion, state estimation, odometry, etc., in conjunction with classical Computer Vision and Deep Learning-based visual inspection algorithms to robustly estimate critical structural parameters.
- Developed and released an open-source library (UVRSABI) for the community. More details here.
Software Engineer
AMDOCS
- Responsible for B2B production-level full-stack software development.
- Developed cross-functional telecom software solutions for Comcast's Orion project (USA).
- Technical Stack: Java, ReactJS, SQL, Spring Boot, Maven, and Jenkins.
Research Intern
Swaayatt Robots
- Improved Visual Odometry and SLAM pipelines for Level-5 Autonomy.
- Devised a semantic variant of the Iterative Closest Point (ICP) algorithm, outperforming vanilla ICP in terms of matching loss and convergence time on the Semantic KITTI dataset.
- Developed a low-level C++ library.
Deep Learning Intern
Sardar Vallabhbhai National Institute Of Technology, Surat
- Implemented the state-of-the-art FaceNet paper and validated it on a custom dataset of 25 students.
Publications
Conference on Robot Learning (CoRL) 2024 Workshops: Learning Fine/Dexterous Manipulation and X-Embodiment
International Conference on Robotics and Automation (ICRA) 2023
Spotlight paper at European Conference on Computer Vision (ECCV) Workshop 2022
7th International Conference on Computer Vision & Image Processing (CVIP) 2022
Projects
Zero-shot policy adaptation: Diffusion Models + LLMs
Zero-shot adaptation leveraging inference-time flexibility (diffusion models) and expressivity (LLMs)
EgoMimic: Scaling Imitation Learning via Egocentric Video
End-to-end robot learning for generalizable bimanual robot manipulation from embodied human data
Asia-Pacific Robotics Contest 2018 & 2019
Autonomous Navigation for OmniDrive and Quadruped robots
Behavior Cloning and Dynamics Models for Robot Manipulation
MLP, RNN, Diffusion policy variants and learning dynamics models in Robomimic
Vision-Language Models for Dense Feedback Reward
VLMs for Natural Language Human Feedback
UAV-based Assessment of Civil Structures
Automated building inspection using aerial images captured by a UAV.
Obstacle Avoidance for UAV
Predicting an obstacle-free patch for high level control commands
Fytbuddy: A real-time gym fitness trainer
A web app-based e-trainer using a Flask web server and a Deep Learning-based model for posture correction
Autonomous Agricultural Robot
AGRIBOT to solve the crop-weed classification problem
RFID System
Identification system using RFID reader, LCD display and Atmel AVR microcontroller.
Image Super-Resolution
A triplet loss-based optimization framework for Image Super-Resolution
Mapping for level-5 Autonomy
Improved point cloud registration and mapping using semantic ICP
Object Detection in Adverse weather setting
Gated Differentiable Image Processing (GDIP), a domain-agnostic architecture for object detection in adverse conditions.
Implementation of Path Searching/Tracking algorithms
Implemented path search/tracking algorithms such as Pure Pursuit, Dijkstra, A*, etc.
Teaching
Featured
- UAV-based Visual Remote Sensing for Automated Building Inspection (UVRSABI)
- Spotlight paper presentation at the CVCIE Workshop at ECCV 2022
- Inaugurated by Central Road Research Institute to deploy in Telangana, India (Sept 2022)
- AGRIBOT was funded under the TEQIP-III program by the Govt. of India, and we also presented it at the open-source ROS Agriculture community meet. [YouTube]
- Secured 12th and 13th ranks in the Asia-Pacific Robot Contest (RoboCon) 2018 and 2019, respectively, among 100-plus universities. [RoboCon2019 YouTube] [RoboCon2018 YouTube]
- Best Working Model - Stirling Engine at the National Science Day Celebrations, Physical Research Laboratory (PRL), India, during 12th grade.