In this work, we introduce a novel zero-shot visual language navigation (VLN) framework in which large models with distinct abilities serve as domain experts. Our proposed navigation agent, DiscussNav, actively discusses with these experts to collect essential information before each move. These discussions cover critical navigation subtasks such as instruction understanding, environment perception, and completion estimation. Results on the representative VLN task R2R show that our method surpasses the leading zero-shot VLN model by a large margin on all metrics.
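To make the discuss-before-moving idea concrete, here is a minimal sketch of a single navigation step. It is not the code in DiscussNav.py: the `DiscussionAgent` class, the `Expert` type, the role names, and the stub experts are all hypothetical, and simple string functions stand in for the large models. It only illustrates the pattern of consulting one expert per subtask (instruction understanding, environment perception, completion estimation) before choosing an action.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical type: an expert maps a text query to a text answer.
# In a real system each expert would wrap a large model.
Expert = Callable[[str], str]


@dataclass
class DiscussionAgent:
    """Illustrative agent that consults experts before every move."""
    experts: Dict[str, Expert]
    log: List[str] = field(default_factory=list)

    def discuss(self, role: str, query: str) -> str:
        # Ask one expert and record the exchange.
        answer = self.experts[role](query)
        self.log.append(f"{role}: {answer}")
        return answer

    def step(self, instruction: str, observation: str) -> str:
        # 1. Instruction understanding: what should be done next?
        action = self.discuss("instruction", instruction)
        # 2. Environment perception: what is currently visible?
        scene = self.discuss("perception", observation)
        # 3. Completion estimation: has the goal been reached?
        done = self.discuss("completion", f"{action} | {scene}")
        return "stop" if done == "yes" else "move"


def make_stub_experts() -> Dict[str, Expert]:
    # Stub experts (assumptions for illustration only).
    return {
        "instruction": lambda q: q.split(",")[0].strip(),
        "perception": lambda q: f"sees {q}",
        "completion": lambda q: "yes" if "sofa" in q else "no",
    }


agent = DiscussionAgent(make_stub_experts())
decision = agent.step("walk to the kitchen, stop at the sink", "a hallway")
```

With these stubs, the completion expert answers "yes" only when the word "sofa" appears in the combined discussion, so the step above returns a move decision. The `log` attribute keeps the three exchanges in order, which is one simple way to inspect what each expert contributed.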
We have prepared R2R Val Unseen data in the tasks/data directory.
Run DiscussNav
python DiscussNav.py
BibTeX
Please cite our paper if you find it helpful :)
@article{long2023discuss,
title={Discuss before moving: Visual language navigation via multi-expert discussions},
author={Long, Yuxing and Li, Xiaoqi and Cai, Wenzhe and Dong, Hao},
journal={arXiv preprint arXiv:2309.11382},
year={2023}
}