We release the HM-EQA dataset, which includes 500 questions about 267 scenes from the HM3D dataset. The questions are available in data/.
Usage
First, specify scene_data_path in the config files with the path to the downloaded HM3D train split, and set hf_token to your Hugging Face user access token. Running a script below for the first time downloads the VLM model; this assumes access to a GPU with sufficient VRAM for the chosen VLM.
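For reference, the corresponding entries in a config file such as cfg/vlm_exp.yaml might look roughly like the sketch below; the field names scene_data_path and hf_token come from this README, and all values are placeholders:

```yaml
# Sketch of the relevant config entries; the path and token are placeholders.
scene_data_path: /path/to/hm3d/train  # root of the downloaded HM3D train split
hf_token: hf_xxxxxxxxxxxxxxxxxxxx     # your Hugging Face user access token
```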
Run our method (VLM-semantic exploration) in Habitat-Sim:
```
python run_vlm_exp.py -cf cfg/vlm_exp.yaml
```
Run CLIP-based exploration in Habitat-Sim:
```
python run_clip_exp.py -cf cfg/clip_exp.yaml
```
Load a scene (with the question from our dataset) in Habitat-Sim:
```
python test_scene.py -cf cfg/test_scene.yaml
```
Scripts
We also share a few scripts that might be helpful:
script/sample_views_from_scene.py: for sampling random views in a scene in Habitat-Sim. We used such images to generate EQA questions with GPT-4V; a minimal sketch of the sampling procedure is shown after this list.
script/get_floor_height.py: for estimating the height of each floor in the scenes of the HM3D dataset; this information is not available in the original dataset.
script/get_questions_gpt4v.py: for generating EQA questions with GPT-4V, using random views of the scene and few-shot examples.
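For illustration, here is a minimal sketch of how random views can be sampled in Habitat-Sim, in the spirit of script/sample_views_from_scene.py. It assumes a recent habitat-sim version; the scene path, sensor settings, and number of views are placeholders, not the script's actual configuration:

```python
import numpy as np
import habitat_sim
from habitat_sim.utils.common import quat_from_angle_axis

SCENE = "/path/to/hm3d/train/<scene>/<scene>.basis.glb"  # placeholder scene path

# Simulator backend configuration.
sim_cfg = habitat_sim.SimulatorConfiguration()
sim_cfg.scene_id = SCENE

# A single RGB camera mounted on the agent.
rgb_spec = habitat_sim.CameraSensorSpec()
rgb_spec.uuid = "color_sensor"
rgb_spec.sensor_type = habitat_sim.SensorType.COLOR
rgb_spec.resolution = [480, 640]  # [height, width]

agent_cfg = habitat_sim.agent.AgentConfiguration()
agent_cfg.sensor_specifications = [rgb_spec]

sim = habitat_sim.Simulator(habitat_sim.Configuration(sim_cfg, [agent_cfg]))
agent = sim.initialize_agent(0)

views = []
for _ in range(10):  # number of views is arbitrary here
    # Sample a random navigable position and a random yaw about the up axis.
    point = sim.pathfinder.get_random_navigable_point()
    yaw = np.random.uniform(0.0, 2.0 * np.pi)

    state = habitat_sim.AgentState()
    state.position = point
    state.rotation = quat_from_angle_axis(yaw, np.array([0.0, 1.0, 0.0]))
    agent.set_state(state)

    obs = sim.get_sensor_observations()
    views.append(obs["color_sensor"])  # H x W x 4 RGBA uint8 array

sim.close()
```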