Hi! I’m a PhD student in EECS at MIT CSAIL, advised by Sara Beery. My research is supported by the NSERC Postgraduate Scholarship and the MIT Generative AI Consortium Grant, and has previously been funded by the Google–MIT Grant and the MIT Viterbi Fellowship.
I work on computer vision and multi-modal learning, and I’m interested in making large AI models more adaptive and human-aligned for specialized, real-world applications. I’m broadly interested in AI systems that support scientific discovery, including applications in ecology and biodiversity monitoring.
Model Adaptation & Specialization: Methods for adapting large models to new tasks and domains, particularly by leveraging specialized or synthetic data.
Human-in-the-Loop Alignment: Integrating expert knowledge and feedback to build more reliable and steerable AI systems.
AI for Scientific Discovery: Developing robust and interpretable models to accelerate and augment scientific workflows.
We introduce a representation learning framework that leverages few-shot T2I diffusion–generated synthetic images to learn object-centric representations for diverse downstream tasks. Using reformulations of two benchmarks and a novel personalized dataset, we demonstrate significant gains across various downstream vision tasks, and analyze the factors driving these improvements.
SKP is a multi-task perception module that jointly optimizes keypoint and part-segmentation learning to improve generalization across category-level objects for robotic grasping. The model performs accurately on real data despite being trained purely in the synthetic domain.
CaDDN is an approach for joint depth estimation and object detection that uses a predicted categorical depth distribution for each pixel to project contextual feature information in 3D space. CaDDN ranked 1st among published monocular methods on the KITTI and Waymo 3D object detection datasets at the time of publication.
theses
Specialization of Vision Representations with Personalized Synthetic Data [SM Thesis] @ MIT, supervised by Sara Beery
Self-supervised Dense Representation Learning [BASc Thesis] @ U of T Eng Sci, supervised by Sanja Fidler
This thesis explores inter-image relationships in contrastive learning, investigating how context between images can be used to improve part-based dense representation learning.
Unsupervised Multimodal Representation Learning for Robot Controls [Research Project] @ RVL. With Yewon L., Pranit C.
Investigated how unsupervised representation learning can be used in an imitation learning (IL) task such as robot path following. In this project, we successfully trained a multimodal representation from unlabeled path traversal data, then fine-tuned it on a small amount of labeled data.
Analysis of Monocular 3D Object Detection Network Performance on Multiview Datasets [Research Presentation] @ TRAILab, funded by NSERC USRA Code
Designed and executed an analysis of the effect of camera perspective on object detection networks, identifying limitations of multi-view self-driving datasets and current state-of-the-art monocular networks. The performance of trained monocular networks on the various views of a multi-view dataset was analyzed with respect to multiple factors, including distance, rotation, and level of occlusion.
Adversarial Examples for Multimodal Object Detection Networks [Research Project] @ RVL, funded by ESROP Dr Allison Mackay Award Code
Led the PyTorch implementation of multi-sensor object detection networks, reaching state-of-the-art performance, for an adversarial examples project aimed at improving the robustness of image and lidar perception models. Trained Faster R-CNN on the nuScenes self-driving dataset and implemented the PointFusion architecture based on the original paper.
Autonomous Rover Drive System [Competition] University/European/Canadian Rover Challenges Code
Spearheaded development of the rover's drive control software pipeline, which includes joystick integration, configuration of the I2C protocol for the motor drivers, variable speed control, and AR tag detection. The software was written in Python and C++, with ROS handling communication between components over topics.
Robot Mail Delivery Localization System [Final Project], University of Toronto, ROB301 Introduction to Robotics. With Samantha U. Detailed Report / Code Coming Soon
A PID control system, Bayesian localization, and state measurement models were developed to enable the robot to navigate an unknown hallway and deliver mail to specified destinations. The project was completed virtually using a TurtleBot3 Waffle Pi robot simulated in a Gazebo environment.
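As a rough illustration of the Bayesian localization idea (a minimal sketch, not the project's actual code), the discrete Bayes filter below alternates a motion update and a measurement update over a one-dimensional hallway. The hallway layout, the `"door"`/`"wall"` labels, the circular wrap-around, and the probabilities `p_move` and `p_hit` are all assumptions made for this example.

```python
def normalize(belief):
    """Rescale a list of non-negative weights so they sum to 1."""
    total = sum(belief)
    return [b / total for b in belief]


def predict(belief, p_move=0.8):
    """Motion update: the robot tries to advance one cell.

    With probability p_move it succeeds; otherwise it stays in place.
    The hallway is treated as circular purely to keep the sketch short.
    """
    n = len(belief)
    return [
        p_move * belief[(i - 1) % n] + (1 - p_move) * belief[i]
        for i in range(n)
    ]


def update(belief, hallway, observation, p_hit=0.9):
    """Measurement update: weight each cell by the observation likelihood."""
    weighted = [
        b * (p_hit if cell == observation else 1 - p_hit)
        for b, cell in zip(belief, hallway)
    ]
    return normalize(weighted)


# Hypothetical map: what the sensor should see at each cell.
hallway = ["wall", "door", "wall", "wall", "door"]

# Start with no idea where the robot is.
belief = [1 / len(hallway)] * len(hallway)

belief = update(belief, hallway, "door")   # robot sees a door
belief = predict(belief)                   # robot moves one cell forward
belief = update(belief, hallway, "wall")   # robot sees a wall

most_likely_cell = max(range(len(belief)), key=belief.__getitem__)
```

After seeing a door, moving, and then seeing a wall, the belief concentrates on the wall cells that directly follow a door (cells 0 and 2 in this layout), which is the behavior a localization filter is meant to produce.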
Accent Classification Network [Final Project], University of Toronto, APS360 Applied Fundamentals of Machine Learning. With Mingshi C., Catherine G., Rocco R. Detailed Report / Code Coming Soon
Motivated by challenges experienced with voice-controlled devices, our team developed a neural network speech accent classifier that takes an accented English phrase as input and predicts the origin of the speaker’s accent. The network was a CRNN architecture with mixed convolutional and recurrent layers, which learned both temporal and characteristic features of the audio data. The audio was processed into Mel-frequency cepstral coefficient (MFCC) features for input.
Autonomous Electric Car Charger Robot [Final Project], University of Toronto, ESC301 Praxis III. With Daniel R. and Chan Y. Detailed Design Report
Conceptualized, designed, and prototyped an autonomous robot that locates and drives to the charging port on a car to plug in the charger, using a time-of-flight sensor, a color sensor, and a Pi camera. Developed modular programs in Python and C++, each responsible for sensor or chassis control, using embedded programming.
Sourdough IR Fermentometer Autonomous Robot [Final Project], University of Toronto, ESC102 Praxis II. With Mingshi C., Daniela L., Peter S. Poster / Demo Video
Designed a fully automated robotics system to inform sourdough bakers when their dough is ready to be put into the oven, in collaboration with local Toronto bakers at Forno Cultura and Blackbird Bakeries. The system is built around a custom yeast-growth model: while the dough is expanding, the robot uses a two-axis linear stage, a time-of-flight sensor, and a temperature sensor to track the dough's growth, and it alerts the bakers when the dough is ready.
Chess AI [Class Project], University of Toronto, CSC190 Algorithms and Data Structures. Code
Developed an automatic chess player that uses alpha-beta pruning on a minimax game tree to select the best move against a human opponent.
Original Game Boy Project: Bit's Adventures [Final Project], University of Toronto, MIE438 Microcontrollers and Embedded Processors. Code / Demo Video
Bit’s Adventure is an original interactive RPG for the original Game Boy, written in C using the GBDK package. The game can be played on Game Boy emulators such as mGBA.
interests / hobbies
Music
I enjoy everything from classical to pop to alternative and have an Associate Diploma in piano performance. I used to play at community events (something I'd love to get back to!) and, in past lives, made brief (and mostly unsuccessful) attempts at classical guitar and the cello.
Cooking
I cook mostly because I like eating, but I really enjoy the process too! I post some of what I make on a small (and very amateur) cooking Instagram, and I’m always open to new recipe ideas.
Nature
I love being outdoors in almost any form: hiking, camping, paddling, or just quietly watching wildlife. I’ve gotten into birding and wildlife photography as a hobbyist (my sister, on the other hand, is a professional; check her out), and one of my favorite recent discoveries has been backcountry canoe trips in Ontario (pictured above), which are both very peaceful and humbling.