GPD-1 proposes a unified approach to scene evolution that accomplishes scene simulation, traffic simulation, closed-loop simulation, map prediction, and motion planning, all without additional fine-tuning.
The pre-trained GPD-1 handles each of these tasks without fine-tuning, simply by conditioning on different prompts:
- Scene Simulation
- Traffic Simulation
- Closed-Loop Simulation
- Map Prediction
- Motion Planning
## Overview
Our model adapts the GPT-like architecture for autonomous driving scenarios with two key innovations: 1) a 2D map scene tokenizer based on VQ-VAE that generates discrete, high-level representations of the 2D BEV map, and 2) a hierarchical quantization agent tokenizer to encode agent information.
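For concreteness, here is a minimal sketch of a VQ-VAE-style map tokenizer: a convolutional encoder compresses the BEV map into a latent grid, each latent vector is snapped to its nearest codebook entry to yield discrete tokens, and a decoder reconstructs the map. The module name `MapTokenizer`, layer sizes, and codebook size are illustrative assumptions, not the repository's actual configuration.

```python
# Minimal VQ-VAE-style map tokenizer sketch (assumed names and sizes).
import torch
import torch.nn as nn

class MapTokenizer(nn.Module):
    def __init__(self, in_channels=3, latent_dim=64, codebook_size=512):
        super().__init__()
        # Convolutional encoder: downsample the BEV map into a latent grid.
        self.encoder = nn.Sequential(
            nn.Conv2d(in_channels, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, latent_dim, 4, stride=2, padding=1),
        )
        # Learnable codebook of discrete latent embeddings.
        self.codebook = nn.Embedding(codebook_size, latent_dim)
        # Decoder mirrors the encoder to reconstruct the BEV map.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(latent_dim, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, in_channels, 4, stride=2, padding=1),
        )

    def forward(self, bev_map):
        z = self.encoder(bev_map)                    # (B, D, H', W')
        B, D, H, W = z.shape
        flat = z.permute(0, 2, 3, 1).reshape(-1, D)  # (B*H'*W', D)
        # Nearest-neighbor lookup against the codebook gives discrete tokens.
        dists = torch.cdist(flat, self.codebook.weight)
        tokens = dists.argmin(dim=1)
        quantized = self.codebook(tokens).view(B, H, W, D).permute(0, 3, 1, 2)
        # Straight-through estimator so gradients flow back to the encoder
        # (a full VQ-VAE would also add codebook and commitment losses).
        quantized_st = z + (quantized - z).detach()
        recon = self.decoder(quantized_st)
        return recon, tokens.view(B, H, W)
```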
Using a scene-level mask, the autoregressive transformer predicts future scenes by conditioning on ground-truth scene tokens during training and on its own previously predicted scene tokens during inference.
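This training/inference distinction is the usual teacher forcing versus free-running decoding. The sketch below illustrates the rollout loop under an assumed interface: a causal `model` that maps a token history of shape `(B, T, N)` to per-token logits `(B, T, N, vocab)`. The function name, shapes, and greedy decoding are hypothetical choices for illustration.

```python
# Teacher-forced vs. free-running scene rollout sketch (assumed interface).
import torch
import torch.nn as nn

def rollout(model: nn.Module, scene_tokens: torch.Tensor, steps: int,
            teacher_forcing: bool) -> torch.Tensor:
    """scene_tokens: (B, T, N) discrete map + agent tokens per frame."""
    history = scene_tokens[:, :1]  # start from the first observed frame
    predictions = []
    for t in range(steps):
        # The causally masked transformer scores the next frame given
        # everything observed or generated so far.
        logits = model(history)                       # (B, T, N, vocab)
        next_frame = logits[:, -1:].argmax(dim=-1)    # greedy decode, (B, 1, N)
        predictions.append(next_frame)
        if teacher_forcing and t + 1 < scene_tokens.size(1):
            # Training: condition on the ground-truth next frame.
            history = torch.cat([history, scene_tokens[:, t + 1 : t + 2]], dim=1)
        else:
            # Inference: condition on the model's own prediction.
            history = torch.cat([history, next_frame], dim=1)
    return torch.cat(predictions, dim=1)

# Usage: future = rollout(model, tokens, steps=10, teacher_forcing=False)
```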
Our work is inspired by these excellent open-source repos:

- PlanTF
- Sledge
- Navsim
## Citation
If you find this project helpful, please consider citing the following paper:
```bibtex
@article{gpd-1,
  title={GPD-1: Generative Pre-training for Driving},
  author={Xie, Zixun and Zuo, Sicheng and Zheng, Wenzhao and Zhang, Yunpeng and Du, Dalong and Zhou, Jie and Lu, Jiwen and Zhang, Shanghang},
  journal={arXiv preprint arXiv:2412.08643},
  year={2024}
}
```