[CVPR 2025] CityWalker: Learning Embodied Urban Navigation from Web-Scale Videos
TL;DR: CityWalker leverages thousands of hours of online city walking and driving videos to train autonomous agents for robust, generalizable navigation in dynamic urban environments through scalable, data-driven imitation learning.
The project should be compatible with the latest PyTorch and CUDA versions. The code is tested with Python 3.11, PyTorch 2.5.0, and CUDA 12.1. To install the dependencies, set up an environment along the lines sketched below.
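For example, a minimal Conda-based setup might look like the following; the environment name, the requirements.txt file, and the CUDA 12.1 wheel index are illustrative assumptions rather than the repository's confirmed instructions:

```bash
# Assumed setup; defer to the repository's actual instructions where they differ.
conda create -n citywalker python=3.11
conda activate citywalker

# Tested PyTorch build with CUDA 12.1 support (wheel index is an assumption).
pip install torch==2.5.0 --index-url https://download.pytorch.org/whl/cu121

# Remaining dependencies; requirements.txt is a hypothetical file name.
pip install -r requirements.txt
```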
Citation

@inproceedings{liu2025citywalker,
title={CityWalker: Learning Embodied Urban Navigation from Web-Scale Videos},
author={Liu, Xinhao and Li, Jintong and Jiang, Yicheng and Sujay, Niranjan and Yang, Zhicheng and Zhang, Juexiao and Abanes, John and Zhang, Jing and Feng, Chen},
booktitle={Proceedings of the Computer Vision and Pattern Recognition Conference},
pages={6875--6885},
year={2025}
}
Acknowledgements
This work was supported by NSF grants 2238968, 2121391, 2322242, and 2345139, and in part by NYU IT High Performance Computing resources, services, and staff expertise. We thank Xingyu Liu and Zixuan Hu for their help with data collection.
We also thank the authors of the following repositories for their open-source implementations: