Hi there! I’m a Ph.D. student in the Automated Driving and Human-Machine System Lab (AutoMan) at
Nanyang Technological University (NTU), where I’m advised by Prof. Chen Lv.
I’m also an AGS scholar in the Robotics and Autonomous Systems department, co-supervised by
Dr. Wei-Yun Yau at the Institute for Infocomm Research (I²R), A*STAR.
Before starting my Ph.D., I completed my B.Eng. (Hons) in Mechanical Engineering at NTU,
where I specialized in Robotics and Mechatronics.
My research interests lie in robot learning, with an emphasis on generalization across diverse tasks and environments.
I am particularly interested in how agents can acquire transferable and scalable policies that remain robust under distribution shifts,
unseen task variations, and dynamic multi-agent settings.
My recent work focuses on leveraging foundation models and structured reasoning to improve real-world robot navigation and decision-making,
especially in vision-language tasks and zero-shot generalization.
2025.11 We won the Best Paper Award (First Prize) for our paper "COVLM-RL" at IEEE ITSC 2025!
2025.10 Our team ReasonX won 2nd Place in the CMU Vision-Language Autonomy Challenge and presented our work at IROS 2025!
2025.04 I will be joining the Safe AI Lab at CMU as a visiting student.
Cool Demos
Scalable Autonomy Stack
A full autonomy stack supporting both slow walking and fast running modes across different robotic platforms.
The system integrates real-time mapping, path planning, terrain analysis, and
collision avoidance to enable smooth goal-directed navigation in real-world environments
while maintaining stable forward motion during turns.
This demo provides a practical platform for deploying and
evaluating high-level vision-language navigation (VLN) policies on legged robots. More
details can be found here.
COVLM-RL integrates Critical Object reasoning with VLM-guided RL to generate semantic driving
priors and align them with low-level control.
It improves training stability and interpretability,
and boosts CARLA success rates by 30% in trained environments and by 50% in unseen ones.
VLMLight is a vision-language-based traffic signal control (TSC) framework that leverages a safety-aware LLM meta-controller to dynamically switch between a fast RL policy and a structured reasoning branch. It introduces the first image-based traffic simulator with multi-view intersection perception, enabling real-time decision-making for both routine and critical scenarios. Experiments demonstrate up to 65% improvement in emergency vehicle response over RL-only systems.
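To make the switching idea above concrete, here is a minimal sketch of the meta-controller routing logic, assuming hypothetical names (Observation, meta_controller, fast_rl_policy, reasoning_branch, the queue-length threshold); it illustrates the dispatch pattern only and is not VLMLight's actual API.

```python
# Illustrative sketch of a safety-aware meta-controller that routes routine
# scenes to a fast RL policy and critical scenes to a slower reasoning branch.
# All names below are placeholders, not VLMLight's real interface.
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Observation:
    images: List[object]        # multi-view intersection frames
    queue_lengths: List[int]    # per-lane vehicle counts from perception
    emergency_vehicle: bool     # flagged by an upstream detector


def meta_controller(obs: Observation,
                    fast_rl_policy: Callable[[Observation], int],
                    reasoning_branch: Callable[[Observation], int]) -> int:
    """Return a signal-phase index, choosing the branch by scenario criticality."""
    # Routine traffic: defer to the fast RL policy for low-latency control.
    if not obs.emergency_vehicle and max(obs.queue_lengths, default=0) < 20:
        return fast_rl_policy(obs)
    # Critical scenario (e.g., an approaching ambulance or severe congestion):
    # fall back to the structured, language-model-based reasoning branch.
    return reasoning_branch(obs)
```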
A vision-language model (VLM)-driven framework that integrates structured chain-of-thought reasoning and
closed-loop feedback to enable zero-shot generalization in object navigation tasks.
We propose a novel transformer-based multi-agent reinforcement learning framework that enables generalizable and cooperative behavior among heterogeneous robot teams across diverse task settings.
We introduce a hierarchical multi-agent reinforcement learning framework that models both inter-vehicle interactions and traffic-level dynamics to achieve robust and cooperative control for autonomous vehicles in dense, heterogeneous traffic scenarios.
We propose a context-aware driver attention estimation framework that fuses gaze tracking, saliency detection, and semantic scene understanding across multiple hierarchical levels to improve prediction accuracy in real-world driving scenarios.
Academic Services
Journal Reviewer
IEEE Transactions on Intelligent Vehicles (T-IV), 2024
IEEE Transactions on Vehicular Technology (T-VT), 2023
IEEE Robotics and Automation Letters (RA-L), 2023-2025
Conference Reviewer
IEEE International Conference on Robotics and Automation (ICRA), 2024-2025
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2023-2025
IEEE Intelligent Transportation Systems Conference (ITSC), 2024-2025