Minting Pan · Yitao Zheng · Jiajian Li · Yunbo Wang · Xiaokang Yang
Offline reinforcement learning (RL) enables policy optimization using static datasets, avoiding the risks and costs of extensive real-world exploration. However, it struggles with suboptimal offline behaviors and inaccurate value estimation due to the lack of environmental interaction. We present Video-Enhanced Offline RL (VeoRL), a model-based method that constructs an interactive world model from diverse, unlabeled video data readily available online. Leveraging model-based behavior guidance, our approach transfers commonsense knowledge of control policy and physical dynamics from natural videos to the RL agent within the target domain. VeoRL achieves substantial performance gains (over 100% in some cases) across visual control tasks in robotic manipulation, autonomous driving, and open-world video games.
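To give a feel for the model-based idea (this is a toy sketch, not the actual VeoRL implementation), a world model lets the agent roll out imagined trajectories and estimate returns without interacting with the real environment. The `dynamics` and `reward` functions below are hypothetical linear placeholders standing in for the learned networks:

```python
# Toy latent world model: hypothetical linear dynamics and reward heads,
# standing in for the learned networks in a model-based method like VeoRL.
def dynamics(state, action):
    # Next latent state (placeholder for a learned transition model).
    return 0.9 * state + 0.5 * action

def reward(state):
    # Placeholder learned reward head: higher reward near state 1.0.
    return -abs(state - 1.0)

def imagined_return(policy, state, horizon=10, gamma=0.99):
    """Roll out the world model for `horizon` steps and sum discounted rewards."""
    total, discount = 0.0, 1.0
    for _ in range(horizon):
        action = policy(state)
        state = dynamics(state, action)
        total += discount * reward(state)
        discount *= gamma
    return total

# A simple policy that pushes the latent state toward the high-reward region.
good_policy = lambda s: 1.0 - s
print(imagined_return(good_policy, state=0.0))
```

In the full method, such imagined returns are what the agent optimizes, with behavior guidance distilled from the unlabeled videos steering the policy.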
MetaWorld:

- Create an environment:

  ```
  conda env create -f set_up/metaworld.yaml
  ```

CARLA:

- Create an environment:

  ```
  conda env create -f set_up/carla.yaml
  ```
- Download and set up CARLA 0.9.10:

  ```
  chmod +x set_up/setup_carla.sh
  ./setup_carla.sh
  ```
- Add the CARLA packages to your Python path:

  ```
  export PYTHONPATH=$PYTHONPATH:/home/CARLA_0.9.10/PythonAPI
  export PYTHONPATH=$PYTHONPATH:/home/CARLA_0.9.10/PythonAPI/carla
  export PYTHONPATH=$PYTHONPATH:/home/CARLA_0.9.10/PythonAPI/carla/dist/carla-0.9.10-py3.7-linux-x86_64.egg
  ```
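If you prefer not to modify `PYTHONPATH` globally, the same three entries can be appended from inside Python before importing `carla` (paths are the same as in the exports above and assume CARLA is installed under `/home/CARLA_0.9.10`):

```python
import sys

# Root of the CARLA 0.9.10 installation (adjust if installed elsewhere).
CARLA_ROOT = "/home/CARLA_0.9.10"

# Same three entries as the PYTHONPATH exports above.
for p in (
    f"{CARLA_ROOT}/PythonAPI",
    f"{CARLA_ROOT}/PythonAPI/carla",
    f"{CARLA_ROOT}/PythonAPI/carla/dist/carla-0.9.10-py3.7-linux-x86_64.egg",
):
    if p not in sys.path:
        sys.path.append(p)
```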
- Merge the directories, i.e., put `carla_env_dream.py` into `CARLA_0.9.10/PythonAPI/carla/agents/navigation/`.
MineDojo:

- Create an environment:

  ```
  conda create -n minedojo python=3.9
  conda activate minedojo
  ```
- Install Java JDK 1.8.0_171, then install the MineDojo environment and MineCLIP following their official documentation.
- Install dependencies:

  ```
  pip install -r set_up/requirements.txt
  ```
- Download the MineCLIP weight here and place it at `./weights/mineclip_attn.pth`.
MetaWorld:

```
python dreamer.py --logdir path/to/log --config defaults metaworld --task metaworld_handle_press --target_dataset_logdir path/to/offline_dataset --source_video_logdir path/to/video
```
CARLA:

Terminal 1:

```
cd CARLA_0.9.10
bash CarlaUE4.sh -fps 20 -opengl
```

Terminal 2:

```
python dreamer.py --logdir path/to/log --config defaults carla --target_dataset_logdir path/to/offline_dataset --source_video_logdir path/to/video
```
When running other environments where the CARLA simulator is not deployed, you may need to comment out the line `from agents.navigation.carla_env_dreamer import CarlaEnv` in `dreamer.py`.
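Instead of commenting the line out by hand, one option (a suggestion, not part of the original repo) is to guard the import so `dreamer.py` still runs on machines without CARLA installed:

```python
# Guarded import: falls back to None when the CARLA package is not on the path,
# so the rest of the script can still run in non-CARLA environments.
try:
    from agents.navigation.carla_env_dreamer import CarlaEnv
except ImportError:
    CarlaEnv = None  # CARLA not available on this machine
```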
MineDojo:

```
python dreamer.py --logdir path/to/log --config defaults minedojo --task minedojo_harvest_log_in_plains --target_dataset_logdir path/to/offline_dataset --source_video_logdir path/to/video
```
To evaluate the pretrained model:

```
python test.py --logdir path/to/log --config defaults minedojo --task minedojo_harvest_log_in_plains --eval_pretrained_model path/to/pretrained_model
```
We provide pretrained weights of VeoRL for the tasks mentioned in the paper. You can download them using the links in the table below:
| Task Name | Weight File |
|---|---|
| metaworld_drawer_open | drawer_open.pt |
| metaworld_handle_press | handle_press.pt |
| metaworld_handle_pull | handle_pull.pt |
| metaworld_plate_slide | plate_slide.pt |
| metaworld_coffee_push | coffee_push.pt |
| metaworld_button_press | button_press.pt |
| carla | carla.pt |
| minedojo_harvest_log_in_plains | log.pt |
| minedojo_harvest_water_with_bucket | water.pt |
| minedojo_harvest_sand | sand.pt |
We provide the source and target datasets used in the paper. You can download them via this link.
If you find this repo useful, please cite our paper:

```bibtex
@inproceedings{pan2025veorl,
  title={Video-Enhanced Offline Reinforcement Learning: A Model-Based Approach},
  author={Minting Pan and Yitao Zheng and Jiajian Li and Yunbo Wang and Xiaokang Yang},
  booktitle={ICML},
  year={2025}
}
```

The code is based on the implementation of dreamer-torch. Thanks to the authors!
