Lingdong Kong
I am a Ph.D. candidate in the Department of Computer Science at the National University of Singapore, advised by Prof. Wei Tsang Ooi, Prof. Benoit Cottereau, and Dr. Lai Xing Ng. I also collaborate closely with Prof. Ziwei Liu from Nanyang Technological University, Singapore.
My research interests include spatial intelligence, multimodal large language models, and 3D/4D world modeling and evaluation.
I am the recipient of the Research Achievement Award (NUS Computing, 2023), Dean's Graduate Research Excellence Award (NUS Computing, 2024), DAAD AInet Fellowship (DAAD, 2025), and Apple Scholars in AI/ML Ph.D. Fellowship (Apple, 2025).
I have been fortunate to collaborate with Apple Machine Learning Research, NVIDIA Research, OpenMMLab, MMLab@NTU, and Motional.
News
- [11/2025] - LiDARCrafter was selected for oral presentation at AAAI 2026.
- [11/2025] - Selected for Global Young Scientists Summit (GYSS) 2026.
- [09/2025] - Talk2Event was selected as a spotlight at NeurIPS 2025.
- [09/2025] - Selected for Doctoral Consortium at ICCV 2025.
- [06/2025] - We are hosting the RoboSense Challenge at IROS 2025.
- [05/2025] - Recognized as an Outstanding Reviewer by CVPR 2025.
- [03/2025] - Selected as an Apple Ph.D. Scholar in AI/ML by Apple.
- [02/2025] - DynamicCity was selected as a spotlight at ICLR 2025.
- [01/2025] - Calib3D was selected for oral presentation at WACV 2025.
- [09/2024] - Place3D was selected as a spotlight at NeurIPS 2024.
- [04/2024] - OpenESS was selected as a highlight at CVPR 2024.
- [11/2023] - We are hosting the RoboDrive Challenge at ICRA 2024.
- [09/2023] - Seal was selected as a spotlight at NeurIPS 2023.
- [03/2023] - LaserMix was selected as a highlight at CVPR 2023.
Industrial Experience
|
Apple AI/ML, Ph.D. Scholar
Mentor: Dr. Afshin Dehghan
|
|
CNRS@CREATE, DesCartes Ph.D. Scholar
Mentor: Prof. Benoit R. Cottereau
|
|
NVIDIA Research, Research Intern
Mentor: Dr. Boris Ivanovic
|
|
TikTok, Research Scientist Intern
Mentor: Dr. Pengfei Wei
|
|
Motional, Autonomous Vehicle Intern
Mentor: Dr. Venice Erin Liong
|
Recent Publications
* equal contributions ‡ project lead § corresponding author
|
AD-R1: Closed-Loop Reinforcement Learning for End-to-End Autonomous Driving with Impartial World Models
Tianyi Yan,
Tao Tang,
Xingtai Cui,
Yongkang Li,
...,
Lingdong Kong,
et al.
Preprint, 2026
|
|
LiDARCrafter: Dynamic 4D World Modeling from LiDAR Sequences
Ao Liang,
Youquan Liu,
Yu Yang,
Dongyue Lu,
Linfeng Li,
Lingdong Kong‡,
et al.
|
|
La La LiDAR: Large-Scale Layout Generation from LiDAR Data
Youquan Liu,
Lingdong Kong,
Weidong Yang,
Xin Li,
Ao Liang,
Runnan Chen,
Ben Fei,
Tongliang Liu
|
|
RewardMap: Tackling Sparse Rewards in Fine-Grained Visual Reasoning via Multi-Stage Reinforcement Learning
Sicheng Feng*,
Kaiwen Tuo*,
Song Wang,
Lingdong Kong,
Jianke Zhu,
Huan Wang
Preprint, 2025
|
|
EditMGT: Unleashing Potentials of Masked Generative Transformers in Image Editing
Wei Chow,
Linfeng Li,
Lingdong Kong,
Zefeng Li,
Qi Xu,
Hang Song,
Tian Ye,
et al.
Preprint, 2025
|
|
Can MLLMs Guide Me Home? A Benchmark Study on Fine-Grained Visual Reasoning from Transit Maps
Sicheng Feng*,
Song Wang*,
Shuyi Ouyang,
Lingdong Kong,
et al.
Preprint, 2025
|
|
Stairway to Success: Zero-Shot Floor-Aware Object-Goal Navigation via LLM-Driven Coarse-to-Fine Exploration
Zeying Gong,
Rong Li,
Tianshuai Hu,
Ronghe Qiu,
Lingdong Kong,
et al.
Preprint, 2025
|
|
|
PixelThink: Towards Efficient Chain-of-Pixel Reasoning
Song Wang,
Gongfan Fang,
Lingdong Kong,
Xiangtai Li,
Jianyun Xu,
Sheng Yang,
Qiang Li,
Jianke Zhu,
Xinchao Wang
Preprint, 2025
|
|
Talk2Event: Grounded Understanding of Dynamic Scenes from Event Cameras
Lingdong Kong*,
Dongyue Lu*,
Ao Liang*,
Rong Li,
Yuhao Dong,
et al.
|
|
VideoLucy: Deep Memory Backtracking for Long Video Understanding
Jialong Zuo,
Yongtai Deng,
Lingdong Kong,
Jingkang Yang,
Rui Jin,
Yiwei Zhang,
Nong Sang,
Liang Pan,
Ziwei Liu,
Changxin Gao
|
|
3EED: Ground Everything Everywhere in 3D
Rong Li*,
Yuhao Dong*,
Tianshuai Hu*,
Ao Liang*,
Youquan Liu*,
Dongyue Lu*,
Liang Pan,
Lingdong Kong‡,
Junwei Liang,
Ziwei Liu
|
|
SPIRAL: Semantic-Aware Progressive LiDAR Scene Generation and Understanding
Dekai Zhu*,
Yixuan Hu*,
Youquan Liu,
Dongyue Lu,
Lingdong Kong,
Slobodan Ilic
|
|
FlexEvent: Towards Flexible Event-Frame Object Detection at Varying Operational Frequencies
Dongyue Lu,
Lingdong Kong,
Gim Hee Lee,
Camille Simon Chane,
Wei Tsang Ooi
|
|
Perspective-Invariant 3D Object Detection
Ao Liang*,
Lingdong Kong*,
Dongyue Lu*,
Youquan Liu,
Jian Fang,
Huaici Zhao,
Wei Tsang Ooi
|
|
Are VLMs Ready for Autonomous Driving? An Empirical Study from the Reliability, Data, and Metric Perspectives
Shaoyuan Xie,
Lingdong Kong‡,
Yuhao Dong,
Chonghao Sima,
et al.
|
|
Beyond One Shot, Beyond One Perspective: Cross-View and Long-Horizon Distillation for Better LiDAR Representations
Xiang Xu,
Lingdong Kong‡,
Song Wang,
Chuanwei Zhou,
Qingshan Liu
|
|
MonoMRN: Monocular Semantic Scene Completion via Masked Recurrent Networks
Xuzhi Wang,
Xinran Wu,
Song Wang,
Lingdong Kong§,
Ziping Zhao
|
|
SafeMap: Robust HD Map Construction from Incomplete Observations
Xiaoshuai Hao,
Lingdong Kong,
Rong Yin,
Pengwei Wang,
Jing Zhang,
Yunfeng Diao,
Shu Zhao
|
|
EventFly: Event Camera Perception from Ground to the Sky
Lingdong Kong,
Dongyue Lu,
Xiang Xu,
Lai Xing Ng,
Wei Tsang Ooi,
Benoit R. Cottereau
|
|
LiMoE: Mixture of LiDAR Data Representation Learners from Automotive Scenes
Xiang Xu*,
Lingdong Kong*,
Hui Shuai,
Liang Pan,
Ziwei Liu,
Qingshan Liu
|
|
GEAL: Generalizable 3D Object Affordance Learning with Cross-Modal Consistency
Dongyue Lu,
Lingdong Kong,
Tianxin Huang,
Gim Hee Lee
|
|
SeeGround: See and Ground for Zero-Shot Open-Vocabulary 3D Visual Grounding
Rong Li,
Shijie Li,
Lingdong Kong,
Xulei Yang,
Junwei Liang
|
|
PointLoRA: Low-Rank Adaptation with Token Selection for Point Cloud Learning
Song Wang,
Xiaolu Liu,
Lingdong Kong,
Jianyun Xu,
Chunyong Hu,
et al.
|
|
DynamicCity: Large-Scale 4D Occupancy Generation from Dynamic Scenes
Hengwei Bian,
Lingdong Kong,
Haozhe Xie,
Liang Pan,
Yu Qiao,
Ziwei Liu
|
|
Calib3D: Calibrating Model Preferences for Reliable 3D Scene Understanding
Lingdong Kong*,
Xiang Xu*,
Jun Cen,
Wenwei Zhang,
Kai Chen,
Ziwei Liu
|
|
LargeAD: Large-Scale Cross-Sensor Data Pretraining for Autonomous Driving
Lingdong Kong*,
Xiang Xu*,
Youquan Liu*,
Jun Cen,
Runnan Chen,
Wenwei Zhang,
Liang Pan,
Kai Chen,
Ziwei Liu
|
|
Multi-Modal Data-Efficient 3D Scene Understanding for Autonomous Driving
Lingdong Kong,
Xiang Xu,
Jiawei Ren,
Wenwei Zhang,
Liang Pan,
Kai Chen,
Wei Tsang Ooi,
Ziwei Liu
|
|
FRNet: Frustum-Range Networks for Scalable LiDAR-Based Semantic Segmentation
Xiang Xu,
Lingdong Kong,
Hui Shuai,
Qingshan Liu
|
|
NUC-Net: Non-Uniform Cylindrical Partition Networks for Efficient LiDAR Semantic Segmentation
Xuzhi Wang,
Wei Feng,
Lingdong Kong,
Liang Wan
|
|
Visual Foundation Models Boost Cross-Modal Unsupervised Domain Adaptation for 3D Semantic Segmentation
Jingyi Xu,
Weidong Yang,
Lingdong Kong,
Youquan Liu,
et al.
|
|
Is Your LiDAR Placement Optimized for 3D Scene Understanding?
Ye Li,
Lingdong Kong,
Hanjiang Hu,
Xiaohao Xu,
Xiaonan Huang
|
|
Is Your HD Map Constructor Reliable under Sensor Corruptions?
Xiaoshuai Hao,
Mengchuan Wei,
Yifan Yang,
Haimei Zhao,
Hui Zhang,
Yi Zhou,
Qiang Wang,
Weiming Li,
Lingdong Kong§,
Jing Zhang
|
|
4D Contrastive Superflows are Dense 3D Representation Learners
Xiang Xu*,
Lingdong Kong*,
Hui Shuai,
Wenwei Zhang,
Liang Pan,
Kai Chen,
Ziwei Liu,
Qingshan Liu
|
|
Learning to Adapt SAM for Segmenting Cross-Domain Point Clouds
Xidong Peng,
Runnan Chen,
Feng Qiao,
Lingdong Kong,
Youquan Liu,
Tai Wang,
Xinge Zhu,
Yuexin Ma
|
|
OpenESS: Event-Based Semantic Scene Understanding with Open Vocabularies
Lingdong Kong,
Youquan Liu,
Lai Xing Ng,
Benoit R. Cottereau,
Wei Tsang Ooi
|
|
Multi-Space Alignments Towards Universal LiDAR Segmentation
Youquan Liu*,
Lingdong Kong*,
Xiaoyang Wu,
Runnan Chen,
Xin Li,
Liang Pan,
Ziwei Liu,
Yuexin Ma
|
|
Unified 3D and 4D Panoptic Segmentation via Dynamic Shifting Networks
Fangzhou Hong,
Lingdong Kong,
Hui Zhou,
Xinge Zhu,
Hongsheng Li,
Ziwei Liu
|
|
Benchmarking and Improving Bird's Eye View Perception Robustness in Autonomous Driving
Shaoyuan Xie,
Lingdong Kong,
Wenwei Zhang,
Jiawei Ren,
et al.
|
|
RoboDepth: Robust Out-of-Distribution Depth Estimation under Corruptions
Lingdong Kong,
Shaoyuan Xie,
Hanjiang Hu,
Lai Xing Ng,
Benoit R. Cottereau,
Wei Tsang Ooi
|
|
Segment Any Point Cloud Sequences by Distilling Vision Foundation Models
Youquan Liu*,
Lingdong Kong*,
Jun Cen,
Runnan Chen,
Wenwei Zhang,
et al.
|
|
Unsupervised Video Domain Adaptation for Action Recognition: A Disentanglement Perspective
Pengfei Wei,
Lingdong Kong,
Xinghua Qu,
Yi Ren,
Zhiqiang Xu,
et al.
|
|
Towards Label-Free Scene Understanding by Vision Foundation Models
Runnan Chen,
Youquan Liu,
Lingdong Kong,
Nenglun Chen,
Xinge Zhu,
Yuexin Ma,
Tongliang Liu,
Wenping Wang
|
|
Robo3D: Towards Robust and Reliable 3D Perception against Corruptions
Lingdong Kong*,
Youquan Liu*,
Xin Li*,
Runnan Chen,
Wenwei Zhang,
Jiawei Ren,
Liang Pan,
Kai Chen,
Ziwei Liu
|
|
Rethinking Range View Representation for LiDAR Segmentation
Lingdong Kong,
Youquan Liu,
Runnan Chen,
Yuexin Ma,
Xinge Zhu,
Yikang Li,
Yuenan Hou,
Yu Qiao,
Ziwei Liu
|
|
UniSeg: A Unified Multi-Modal LiDAR Segmentation Network and the OpenPCSeg Codebase
Youquan Liu,
Runnan Chen,
Xin Li,
Lingdong Kong,
Yuchen Yang,
et al.
|
|
LaserMix for Semi-Supervised LiDAR Semantic Segmentation
Lingdong Kong*,
Jiawei Ren*,
Liang Pan,
Ziwei Liu
|
|
CLIP2Scene: Towards Label-Efficient 3D Scene Understanding by CLIP
Runnan Chen,
Youquan Liu,
Lingdong Kong,
Xinge Zhu,
Yuexin Ma,
Yikang Li,
Yuenan Hou,
Yu Qiao,
Wenping Wang
|
|
ConDA: Unsupervised Domain Adaptation for LiDAR Segmentation via Regularized Domain Concatenation
Lingdong Kong,
Niamul Quader,
Venice Erin Liong
|
|
Benchmarking 3D Robustness to Common Corruptions and Sensor Failure
Lingdong Kong*,
Youquan Liu*,
Xin Li*,
Runnan Chen,
Wenwei Zhang,
et al.
|
Tech Reports
|
The RoboSense Challenge: Sense Anything, Navigate Anywhere, Adapt Across Platforms
Lingdong Kong,
Shaoyuan Xie,
Zeying Gong,
Ye Li,
Meng Chu,
Ao Liang,
et al.
Technical Report, 2025
|
|
The RoboDepth Challenge: Methods and Advancements Towards Robust Depth Estimation
Lingdong Kong,
Yaru Niu,
Shaoyuan Xie,
Hanjiang Hu,
Lai Xing Ng,
et al.
Technical Report, 2023
|
Workshop Organizers
Academic Services
Conference Reviewer
- IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
- IEEE/CVF International Conference on Computer Vision (ICCV)
- IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)
- European Conference on Computer Vision (ECCV)
- Conference on Neural Information Processing Systems (NeurIPS)
- International Conference on Learning Representations (ICLR)
- International Conference on Machine Learning (ICML)
- IEEE International Conference on Robotics and Automation (ICRA)
- IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
- AAAI Conference on Artificial Intelligence (AAAI)
Journal Reviewer
- International Journal of Computer Vision (IJCV)
- International Journal of Robotics Research (IJRR)
- IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)
- IEEE Transactions on Neural Networks and Learning Systems (TNNLS)
- IEEE Transactions on Intelligent Vehicles (TIV)
- IEEE Transactions on Intelligent Transportation Systems (TITS)
- IEEE Transactions on Circuits and Systems for Video Technology (TCSVT)
- IEEE Transactions on Multimedia (TMM)
- IEEE Transactions on Knowledge and Data Engineering (TKDE)
- IEEE Robotics and Automation Letters (RA-L)
- ISPRS Journal of Photogrammetry and Remote Sensing (P&RS)
© Last Updated: November 2025