Ye Li¹, Wenzhao Zheng², Xiaonan Huang¹, Kurt Keutzer²

¹University of Michigan, Ann Arbor  ²University of California, Berkeley
UniDrive is a novel framework for generalizing vision-centric 3D perception models across diverse multi-camera configurations.
- To the best of our knowledge, UniDrive presents the first comprehensive framework designed to generalize vision-centric 3D perception models across diverse camera configurations.
- We introduce a novel strategy that transforms images into a unified virtual camera space, enhancing robustness to variations in camera parameters.
- We propose a virtual configuration optimization strategy that minimizes projection error, improving model generalization with minimal performance degradation.
- We contribute a systematic data generation platform, a 160,000-frame multi-camera dataset, and a benchmark for evaluating perception models across varying camera configurations.
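The virtual camera space transformation above can be illustrated with a minimal sketch. Under a pure-rotation pinhole-camera approximation, warping a physical camera's image into a virtual camera is a planar homography `H = K_virt · R · K_src⁻¹`; the function name and intrinsics values below are illustrative, not the actual UniDrive implementation.

```python
import numpy as np

def virtual_camera_homography(K_src, K_virt, R):
    """Homography mapping source-camera pixels to virtual-camera pixels.

    Assumes a pinhole model and a pure rotation R between the physical
    and virtual cameras: H = K_virt @ R @ inv(K_src).
    """
    return K_virt @ R @ np.linalg.inv(K_src)

# Example intrinsics (illustrative values for a 1280x720 image).
K = np.array([[800.0,   0.0, 640.0],
              [  0.0, 800.0, 360.0],
              [  0.0,   0.0,   1.0]])

# Identical intrinsics and no rotation yield the identity mapping.
H = virtual_camera_homography(K, K, np.eye(3))
print(np.allclose(H, np.eye(3)))  # True
```

A real pipeline would apply `H` per pixel (e.g., with an image-warping routine) for each physical camera in the rig, producing images as if captured by the shared virtual configuration.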
Visit our project page to explore more examples. 🚙
- [2025.01] - UniDrive is accepted at ICLR 2025.
- [2024.10] - Our paper is available on arXiv.
- Data Preparation
- Camera Configuration
- UniDrive Benchmark
- TODO List
- Citation
- License
- Acknowledgements
The UniDrive dataset consists of eight camera configurations, inspired by the sensor layouts of existing autonomous vehicle companies.
Each camera configuration contains 4 to 8 cameras. For each configuration, the sub-dataset consists of 20,000 frames, split into 10,000 samples for training and 10,000 for validation.
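The per-configuration counts above multiply out to the dataset's stated 160,000-frame total; a quick sanity check (values taken directly from the text):

```python
# Dataset composition as stated above.
configs = 8                  # eight camera configurations
frames_per_config = 20_000   # per-configuration sub-dataset size
train_frames, val_frames = 10_000, 10_000

assert train_frames + val_frames == frames_per_config
total_frames = configs * frames_per_config
print(total_frames)  # 160000
```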
| ![]() | ![]() | ![]() | ![]() |
|---|---|---|---|
| Town 1 | Town 3 | Town 4 | Town 6 |
We choose four maps (Towns 1, 3, 4, and 6) in CARLA v0.9.10 to collect point cloud data and generate ground-truth annotations. For each map, we manually define 6 ego-vehicle routes that together cover all roads without overlap. The simulation runs at 20 Hz.
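The collection setup described above can be sketched as a CARLA configuration fragment. This is illustrative, not the project's actual collection script: it requires a running CARLA 0.9.10 server, and the host/port values are assumed defaults.

```python
import carla  # CARLA 0.9.10 Python API

# Connect to a running CARLA server (host/port are illustrative defaults).
client = carla.Client('localhost', 2000)
client.set_timeout(10.0)

# Load one of the four maps used for collection (Towns 1, 3, 4, 6).
world = client.load_world('Town01')

# Fixed-step synchronous simulation at 20 Hz, as described above.
settings = world.get_settings()
settings.synchronous_mode = True
settings.fixed_delta_seconds = 1.0 / 20.0  # 0.05 s per simulation step
world.apply_settings(settings)
```

In synchronous mode the server advances only on `world.tick()`, which keeps sensor captures aligned with the fixed 20 Hz step.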
Our datasets are hosted by OpenDataLab.
OpenDataLab is an open data platform for the large AI model era. Through it, researchers can obtain free, pre-formatted datasets across a wide range of fields.
Kindly refer to DATA_PREPARE.md for the details to prepare the UniDrive dataset.
| ![]() | ![]() | ![]() | ![]() |
|---|---|---|---|
| 4x95 | 5x75 | 6x60 | 6x70 |
| ![]() | ![]() | ![]() | ![]() |
|---|---|---|---|
| 6x80a | 6x80b | 5x70+110 | 8x50 |
- Initial release. 🚀
- Add Camera Configuration benchmarks.
- Add more 3D perception models.
If you find this work helpful for your research, please kindly consider citing our paper:
```bibtex
@inproceedings{li2024unidrive,
  title={UniDrive: Towards Universal Driving Perception Across Camera Configurations},
  author={Li, Ye and Zheng, Wenzhao and Huang, Xiaonan and Keutzer, Kurt},
  booktitle={International Conference on Learning Representations (ICLR)},
  year={2025}
}
```

This work is under the MIT License, while some specific implementations in this codebase might be under other licenses.
This work is developed based on the MMDetection3D codebase.
MMDetection3D is an open-source, PyTorch-based toolbox for general 3D perception, developed as part of the OpenMMLab project.