I am a fourth-year undergraduate student studying computer science with a mathematics minor, and I am pursuing academic research.
I am interested in perception, whether by machines or humans, and whether via colours, depths, or something else. The perception I wish to study is not merely the encounter with external stimuli but also the filtering, summarizing, experiencing, and the many other cognitive and computational processes that accompany it, whether naturally occurring in humans or ingeniously implemented in machines.
In my free time, I love learning languages and the linguistic facts about them. Thanks to my upbringing, I acquired Mandarin Chinese and English across Hangzhou and Vancouver. I then learned Spanish in school, and now I am learning Japanese and French for fun. Knowing languages pays off massively for cultural understanding when travelling, which is another reason I love it!
My favourite sports are tennis and skiing/snowboarding.
Real-world applications of Visual Place Recognition (VPR) often rely on cloud computing, where query images or videos are transmitted elsewhere for visual localization. However, limited network bandwidth forces a reduction in image quality, which degrades global image descriptors and consequently VPR accuracy. We address this issue at the descriptor extraction level with a knowledge-distillation framework that transfers feature representations from high-quality images to low-quality ones, allowing VPR methods to produce more discriminative descriptors from the latter. Our approach leverages three complementary loss functions, namely the Inter-channel Correlation Knowledge Distillation (ICKD) loss, the Mean Squared Error (MSE) loss, and a weakly supervised Triplet loss, to guide a student network in approximating the high-quality features produced by a teacher network. Extensive experiments on multiple VPR methods and datasets subjected to JPEG compression, resolution reduction, and video quantization demonstrate significant improvements in recall rates. Furthermore, our work fills a gap in the literature by exploring the impact of video-based data on VPR performance, thereby contributing to more reliable place recognition in resource-constrained environments.
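To make the three-loss objective concrete, here is a minimal NumPy sketch of how the combined distillation loss could look. This is an illustrative toy version, not the paper's implementation: the feature shapes, the loss weights `w`, the `margin` value, and all function names are assumptions for exposition only.

```python
import numpy as np

def ickd_loss(f_s, f_t):
    """Inter-channel correlation KD (illustrative sketch): match the
    channel-correlation matrices of student and teacher feature maps.
    f_s, f_t: arrays of shape (C, H*W), one row per channel."""
    def corr(f):
        f = f / (np.linalg.norm(f, axis=1, keepdims=True) + 1e-8)
        return f @ f.T  # (C, C) channel-correlation matrix
    return float(np.mean((corr(f_s) - corr(f_t)) ** 2))

def triplet_loss(anchor, pos, neg, margin=0.1):
    """Standard margin-based triplet loss on global descriptors;
    the margin value here is a placeholder."""
    d_ap = np.linalg.norm(anchor - pos)
    d_an = np.linalg.norm(anchor - neg)
    return float(max(0.0, d_ap - d_an + margin))

def total_loss(f_s, f_t, anchor, pos, neg, w=(1.0, 1.0, 1.0)):
    """Weighted sum of the ICKD, MSE, and triplet terms; the weights
    w are hypothetical and would be tuned in practice."""
    l_ickd = ickd_loss(f_s, f_t)
    l_mse = float(np.mean((f_s - f_t) ** 2))
    l_tri = triplet_loss(anchor, pos, neg)
    return w[0] * l_ickd + w[1] * l_mse + w[2] * l_tri
```

The ICKD term compares how channels co-vary rather than matching features pointwise, so it tolerates per-channel shifts that the MSE term would penalize; the triplet term keeps the student's descriptors discriminative for retrieval, not just close to the teacher's.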