I'm interested in machine learning and computer vision applied to robotics. My research focuses on developing methods that leverage offline data to distill perceptual and behavioral priors into embodied agents, enabling them to learn robust policies from minimal environment interaction.
Although pretraining on large amounts of data is beneficial for robot learning, current paradigms only perform large-scale pretraining for visual representations, whereas representations for other modalities are trained from scratch. In contrast to the abundance of visual data, it is unclear what relevant internet-scale data could be used to pretrain other modalities such as tactile sensing. Such pretraining becomes increasingly crucial in the low-data regimes common in robotics applications. In this paper, we address this gap by using contact microphones as an alternative tactile sensor. Our key insight is that contact microphones capture inherently audio-based information, allowing us to leverage large-scale audio-visual pretraining to obtain representations that boost the performance of robotic manipulation. To the best of our knowledge, our method is the first approach leveraging large-scale multisensory pretraining for robotic manipulation.
Teaching Assistant
(10-417/617) Intermediate Deep Learning - Professors Ruslan Salakhutdinov and Yuanzhi Li
Teaching Assistant
(CSCI158) Machine Learning - Professor David Kauchak
(CSCI101) Introduction to Languages and Theory of Computation - Professor Kim Bruce
(CSCI062) Data Structures and Advanced Programming - Professors Anthony Clark and David Kauchak
(CSCI051) Introduction to Computer Science - Professors Tzu-Yi Chen and Eleanor Birrell
(ECON052) Principles of Microeconomics - Professor Malte Dolde