I am a Postdoctoral Research Fellow in the School of Mechanical and Aerospace Engineering at Nanyang Technological University (NTU), working with Jianfei Yang at the MARS Lab. I completed my PhD in 2025 at the University of Edinburgh under the supervision of Laura Sevilla-Lara, collaborating closely with Varun Jampani and Deqing Sun.
The central focus of my research is to enable robots to perceive and interact with their environments in a human-like yet data-efficient manner. This includes exploring a range of cutting-edge topics:
Multimodal AI: Vision-Language Models (VLMs), Multimodal Large Language Models (MLLMs)
Embodied AI: Robot Learning, Vision-Language-Action (VLA) Models, Reinforcement Learning (RL)
Generative AI: Image / Video Generation, World Models
Efficient AI: Transfer Learning, Learning under Limited Data & Supervision
Human / Robot Interaction: Human-Object Interaction, Human-to-Robot Learning
📢 I am seeking highly self-motivated PhD, Master's, and Bachelor's students for potential research collaborations. Feel free to contact me via email if you are interested.
AAAI’26
Mask2IV: Interaction-Centric Video Generation via Mask Trajectories
Gen Li, Bo Zhao, Jianfei Yang, and Laura Sevilla-Lara
In AAAI Conference on Artificial Intelligence, 2026
@inproceedings{Mask2IV,
  title     = {Mask2IV: Interaction-Centric Video Generation via Mask Trajectories},
  author    = {Li, Gen and Zhao, Bo and Yang, Jianfei and Sevilla-Lara, Laura},
  year      = {2026},
  booktitle = {AAAI Conference on Artificial Intelligence},
}
ICCV’25
Learning Precise Affordances from Egocentric Videos for Robotic Manipulation
Gen Li, Nikolaos Tsagkas, Jifei Song, Ruaridh Mon-Williams, Sethu Vijayakumar, and 2 more authors
In IEEE/CVF International Conference on Computer Vision, 2025
@inproceedings{Aff-Grasp,
  title     = {Learning Precise Affordances from Egocentric Videos for Robotic Manipulation},
  author    = {Li, Gen and Tsagkas, Nikolaos and Song, Jifei and Mon-Williams, Ruaridh and Vijayakumar, Sethu and Shao, Kun and Sevilla-Lara, Laura},
  year      = {2025},
  booktitle = {IEEE/CVF International Conference on Computer Vision},
}
NMI’25
Embodied Large Language Models Enable Robots to Complete Complex Tasks in Unpredictable Environments
Ruaridh Mon-Williams, Gen Li, Ran Long, Wenqian Du, and Chris Lucas
In Nature Machine Intelligence, 2025
@article{ELLMER,
  title   = {Embodied Large Language Models Enable Robots to Complete Complex Tasks in Unpredictable Environments},
  author  = {Mon-Williams, Ruaridh and Li, Gen and Long, Ran and Du, Wenqian and Lucas, Chris},
  journal = {Nature Machine Intelligence},
  year    = {2025},
}
CVPR’24
One-Shot Open Affordance Learning with Foundation Models
Gen Li, Deqing Sun, Laura Sevilla-Lara, and Varun Jampani
In IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
@inproceedings{OOAL,
  title     = {One-Shot Open Affordance Learning with Foundation Models},
  author    = {Li, Gen and Sun, Deqing and Sevilla-Lara, Laura and Jampani, Varun},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year      = {2024},
}
CVPR’23
LOCATE: Localize and Transfer Object Parts for Weakly Supervised Affordance Grounding
Gen Li, Varun Jampani, Deqing Sun, and Laura Sevilla-Lara
In IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
@inproceedings{LOCATE,
  title     = {LOCATE: Localize and Transfer Object Parts for Weakly Supervised Affordance Grounding},
  author    = {Li, Gen and Jampani, Varun and Sun, Deqing and Sevilla-Lara, Laura},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year      = {2023},
}
CVPR’21
Adaptive Prototype Learning and Allocation for Few-Shot Segmentation
Gen Li, Varun Jampani, Laura Sevilla-Lara, Deqing Sun, Jonghyun Kim, and 1 more author
In IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021
@inproceedings{ASGNet,
  title     = {Adaptive Prototype Learning and Allocation for Few-Shot Segmentation},
  author    = {Li, Gen and Jampani, Varun and Sevilla-Lara, Laura and Sun, Deqing and Kim, Jonghyun and Kim, Joongkyu},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages     = {8334--8343},
  year      = {2021},
}