This project is currently under refactoring.
Checkpoints, evaluation results, and videos have been uploaded to Google Drive:
- Droid
We use the processed data from cadence/droid_1.0.1 as it has the camera extrinsics attached. Download it to anywhere you like, and make a symbolic link to it as ./data_raw/droid_1.0.1. Then run:

conda activate lerobot
python data_prepare/process_droid.py \
--input_root ./data_raw/droid_1.0.1 \
--alter_vid_root VIDEO_DOWNLOAD_PATH \
--output_root ./data_converted/droid \
--skip_saved
Note:
- This requires lerobot to be installed. We use version 0.1.0. You may need to create a new conda environment (e.g. named lerobot) and install the package via: pip install "lerobot==0.1.0"
- The initial downloads of the video files may be incomplete (tested in 2025/04). We need to download the full video files and place them at VIDEO_DOWNLOAD_PATH. TODO: upload scripts to fix this.
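Optionally, before running the conversion, you can sanity-check that the symbolic link points at the expected lerobot-style layout (the data / meta / videos sub-directories shown in the file structure further below). A minimal Python sketch:

from pathlib import Path

# Check that ./data_raw/droid_1.0.1 resolves and contains the expected sub-directories.
root = Path("./data_raw/droid_1.0.1")
assert root.is_dir(), f"{root} not found -- did you create the symbolic link?"
for sub in ("data", "meta", "videos"):
    print(f"{sub}: {'ok' if (root / sub).is_dir() else 'MISSING'}")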
- Maniskill
First download the data to anywhere you like, e.g.:
mkdir -p ANYWHERE/maniskill
gsutil -m cp -r gs://gresearch/robotics/maniskill_dataset_converted_externally_to_rlds/0.1.0 ANYWHERE/maniskill
ln -s ANYWHERE/maniskill ./data_raw/maniskill
Then run:
conda activate tensorflow
python data_prepare/process_maniskill.py \
--input_root ./data_raw/maniskill/0.1.0 \
--output_root ./data_converted/maniskill/0.1.0 \
--visualize
Note:
- This requires tensorflow to be installed. You may need to create a new conda environment (e.g. named tensorflow), install tensorflow there, and run the above command to generate the data.
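Optionally, you can verify that the downloaded shards load before converting them. A minimal sketch, assuming tensorflow_datasets is available in the tensorflow environment and the dataset follows the standard RLDS layout with a nested steps dataset:

import tensorflow_datasets as tfds

# Build the dataset directly from the downloaded directory (no re-download).
builder = tfds.builder_from_directory("./data_raw/maniskill/0.1.0")
print(builder.info.splits["train"].num_examples, "episodes")

# Inspect the keys of the first step of the first episode.
episode = next(iter(builder.as_dataset(split="train").take(1)))
step = next(iter(episode["steps"].take(1)))
print(list(step.keys()))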
- Metaworld
This doesn't require downloading extra data. However, you may still need to create a new conda environment (e.g. named metaworld-v3) and then install the metaworld package via:

pip install "metaworld==2.0.0"

Then run:

conda activate metaworld-v3
python data_prepare/process_metaworld.py \
--output_root ./data_converted/metaworld \
--visualize \
--skip_saved
Note:
- Although we install "metaworld==2.0.0", it is actually version 3.
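After conversion, you can peek inside one of the generated episode files with h5py. The internal layout of the .h5 files is not documented here, so this sketch simply lists whatever groups and datasets are stored (the path assumes the assembly-v3 task shown in the file structure below):

import h5py

# List every group/dataset in a converted Metaworld episode, with dataset shapes.
path = "./data_converted/metaworld/assembly-v3/0000.h5"
with h5py.File(path, "r") as f:
    f.visititems(lambda name, obj: print(name, getattr(obj, "shape", "")))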
If you have downloaded and processed all the data, the file structure should look like this:
data_raw
├── droid_1.0.1
│ ├── README.md
│ ├── data
│ │ ├── chunk-000
│ │ ├── chunk-001
│ │ └── ...
│ ├── meta
│ │ ├── episodes.jsonl
│ │ ├── episodes_stats.jsonl
│ │ ├── info.json
│ │ └── tasks.jsonl
│ └── videos
│ ├── chunk-000
│ ├── chunk-001
│ └── ...
├── libero
│ ├── datasets
│ ├── libero_10
│ │ ├── KITCHEN_SCENE3_turn_on_the_stove_and_put_the_moka_pot_on_it_demo.hdf5
│ │ ├── KITCHEN_SCENE4_put_the_black_bowl_in_the_bottom_drawer_of_the_cabinet_and_close_it_demo.hdf5
│ │ └── ...
│ ├── libero_90
│ │ ├── KITCHEN_SCENE10_close_the_top_drawer_of_the_cabinet_and_put_the_black_bowl_on_top_of_it_demo.hdf5
│ │ ├── KITCHEN_SCENE10_close_the_top_drawer_of_the_cabinet_demo.hdf5
│ │ └── ...
│ ├── libero_goal
│ │ ├── open_the_middle_drawer_of_the_cabinet_demo.hdf5
│ │ ├── open_the_top_drawer_and_put_the_bowl_inside_demo.hdf5
│ │ └── ...
│ ├── libero_object
│ │ ├── pick_up_the_alphabet_soup_and_place_it_in_the_basket_demo.hdf5
│ │ ├── pick_up_the_bbq_sauce_and_place_it_in_the_basket_demo.hdf5
│ │ └── ...
│ └── ...
└── maniskill
└── 0.1.0
├── dataset_info.json
├── features.json
├── maniskill_dataset_converted_externally_to_rlds-train.tfrecord-00000-of-01024
├── maniskill_dataset_converted_externally_to_rlds-train.tfrecord-00001-of-01024
└── ...
data_converted
├── drawer
│ ├── 0000.h5
│ ├── 0001.h5
│ └── ...
├── droid
│ ├── data
│ │ ├── chunk-000
│ │ ├── chunk-001
│ │ └── ...
│ └── videos
│ ├── chunk-000
│ ├── chunk-001
│ └── ...
├── libero
│ ├── libero_10_no_noops
│ │ ├── KITCHEN_SCENE3_turn_on_the_stove_and_put_the_moka_pot_on_it
│ │ ├── KITCHEN_SCENE4_put_the_black_bowl_in_the_bottom_drawer_of_the_cabinet_and_close_it
│ │ └── ...
│ ├── libero_90_no_noops (not used in fine-tuning)
│ │ ├── KITCHEN_SCENE10_close_the_top_drawer_of_the_cabinet
│ │ ├── KITCHEN_SCENE10_close_the_top_drawer_of_the_cabinet_and_put_the_black_bowl_on_top_of_it
│ │ └── ...
│ ├── libero_goal_no_noops
│ │ ├── open_the_middle_drawer_of_the_cabinet
│ │ ├── open_the_top_drawer_and_put_the_bowl_inside
│ │ └── ...
│ ├── libero_object_no_noops
│ │ ├── pick_up_the_alphabet_soup_and_place_it_in_the_basket
│ │ ├── pick_up_the_bbq_sauce_and_place_it_in_the_basket
│ │ └── ...
│ └── libero_spatial_no_noops
│ ├── pick_up_the_black_bowl_between_the_plate_and_the_ramekin_and_place_it_on_the_plate
│ ├── pick_up_the_black_bowl_from_table_center_and_place_it_on_the_plate
│ └── ...
├── maniskill
│ └── 0.1.0
│ ├── 00000.h5
│ ├── 00001.h5
│ └── ...
├── metaworld
│ ├── assembly-v3
│ │ ├── 0000.h5
│ │ ├── 0001.h5
│ │ └── ...
│ ├── basketball-v3
│ │ ├── 0000.h5
│ │ ├── 0001.h5
│ │ └── ...
│ └── ...
├── oven
│ ├── 0000.h5
│ ├── 0001.h5
│ └── ...
└── pick-place-can
├── 0000.h5
├── 0001.h5
└── ...
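Before visualizing or training, you can quickly confirm that the converted datasets you expect are in place. A minimal sketch (directory names taken from the layout above; adjust the list to whatever you actually prepared):

from pathlib import Path

expected = [
    "data_converted/droid",
    "data_converted/maniskill/0.1.0",
    "data_converted/metaworld",
    "data_converted/libero",
]
for name in expected:
    path = Path(name)
    status = "ok" if path.is_dir() and any(path.iterdir()) else "MISSING or empty"
    print(f"{name}: {status}")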
Visualizing the processed data is recommended before training. Run:

python datavis.py {DATASET_NAME}

to visualize the specified dataset. Run python datavis.py -l to list all available datasets.
You can use python train.py -h to see the help message. To pretrain on the above three datasets, run:
CUDA_VISIBLE_DEVICES=x python train.py --config pretrain -s EXPERIMENT_NAME

To pretrain on all the datasets mentioned in the paper, run:

CUDA_VISIBLE_DEVICES=x python train.py --config pretrain_extra -s EXPERIMENT_NAME

This will save the logs to ./logs/E2VLA/EXPERIMENT_NAME and the checkpoints to ./checkpoints/E2VLA/EXPERIMENT_NAME.
We have uploaded two checkpoints here.
First download the LIBERO dataset to anywhere you like and then make a symbolic link to it as ./data_raw/libero. Then run:
conda activate libero
python data_prepare/process_libero.py \
--libero_task_suite libero_spatial \
--libero_raw_data_dir ./data_raw/libero \
--libero_target_dir ./data_converted/libero \
--skip_saved \
--visualize

Change libero_spatial to one of [libero_object, libero_goal, libero_10] for fine-tuning and evaluation on the other task suites. A convenience loop over all four suites is sketched below.
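To prepare all four task suites in one go, you can wrap the command above in a small loop. A minimal sketch (run it from the libero environment; add --visualize if you also want the per-suite visualization):

import subprocess

# Run the documented conversion once per LIBERO task suite.
for suite in ["libero_spatial", "libero_object", "libero_goal", "libero_10"]:
    subprocess.run(
        [
            "python", "data_prepare/process_libero.py",
            "--libero_task_suite", suite,
            "--libero_raw_data_dir", "./data_raw/libero",
            "--libero_target_dir", "./data_converted/libero",
            "--skip_saved",
        ],
        check=True,
    )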
For example, if we want to fine-tune on libero_10 from a pretrained model:
CUDA_VISIBLE_DEVICES=x python train.py \
--config finetune_libero_10 \
--pretrained_ckpt ./checkpoints/E2VLA/PRETRAIN_EXP_NAME/ckpt_xxxxxxx.pt \
-s FINETUNE_EXPERIMENT_NAME

This will load the config and the pre-trained weights. The fine-tuned weights are saved to ./checkpoints/E2VLA/FINETUNE_EXPERIMENT_NAME/. We save the weights every 10k iterations by default.
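The checkpoint file name (ckpt_xxxxxxx.pt above) encodes the iteration at which it was saved. A small sketch for picking the most recent checkpoint of an experiment, assuming the ckpt_<iteration>.pt naming shown above:

from pathlib import Path

ckpt_dir = Path("./checkpoints/E2VLA/PRETRAIN_EXP_NAME")
# Sort by the iteration number embedded in the file name and take the latest.
latest = max(ckpt_dir.glob("ckpt_*.pt"), key=lambda p: int(p.stem.split("_")[1]))
print(latest)  # pass this path to --pretrained_ckpt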
Finetuned checkpoints can be found here.
- First, we need to launch the Pyro4 name server (similar in spirit to roscore). Open a separate terminal and run:

pyro4-ns

By default, the name server runs on localhost:9090.
- Launch the planning service of your fine-tuned model (a connectivity-check sketch is included at the end of this section):

CUDA_VISIBLE_DEVICES=x python -m infer_utils.remote_service \
--ckpt ./checkpoints/E2VLA/FINETUNE_EXPERIMENT_NAME/ckpt_xxxxxxx.pt \
--uri CUSTOM_URI_NAME
- Start the evaluation in simulation:

python -m examples.libero.eval \
--task_suite libero_10 \
--uri CUSTOM_URI_NAME \
--save --video
The results are saved to ./eval_results/TASK_SUITE/URI/, and the videos are saved to ./eval_videos/TASK_SUITE/URI/.
Evaluation results and videos using our fine-tuned checkpoints can be found here and here.
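If examples.libero.eval cannot reach the planning service, you can check that the service has registered itself with the Pyro4 name server. A minimal sketch using the standard Pyro4 client API (it only resolves and binds the proxy; the remote methods exposed by infer_utils.remote_service are not listed here):

import Pyro4

# Locate the running name server (localhost:9090 by default) and list registrations.
ns = Pyro4.locateNS()
print(ns.list())  # CUSTOM_URI_NAME should appear once the planning service is up

# Resolve a proxy to the registered service by name and make sure it is reachable.
planner = Pyro4.Proxy("PYRONAME:CUSTOM_URI_NAME")
planner._pyroBind()  # raises a CommunicationError if the service cannot be reached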