Experimental Results
Real World Videos
We fine-tune three base VLA models (OpenVLA-OFT, π0-FAST, and π0) with and without AimBot to evaluate our method.
We also compare AimBot against other visual augmentations: Traces, RoboPoint, and raw depth images.
Sample rollouts are shown below. All rollouts from the paper, covering all models and including OOD scenes, are available here.
| Model | Fruits in Box | Tennis Ball in Drawer | Bread in Toaster | Place Coffee Cup | Egg in Carton | Total Success |
|---|---|---|---|---|---|---|
| OpenVLA-OFT | 7/10 | 6/10 | 4/10 | 2/10 | 2/10 | 21/50 |
| OpenVLA-OFT + AimBot | 9/10 | 7/10 | 9/10 | 8/10 | 3/10 | 36/50 |
| π0-FAST | 10/10 | 10/10 | 9/10 | 7/10 | 6/10 | 42/50 |
| π0-FAST + AimBot | 10/10 | 10/10 | 10/10 | 9/10 | 8/10 | 47/50 |
| π0 | 7/10 | 7/10 | 4/10 | 5/10 | 4/10 | 27/50 |
| π0 + AimBot | 10/10 | 10/10 | 7/10 | 8/10 | 8/10 | 43/50 |
| π0 + Traces | 8/10 | 8/10 | 5/10 | 2/10 | 2/10 | 25/50 |
| π0 + RoboPoint | 8/10 | 9/10 | 4/10 | 6/10 | 0/10 | 27/50 |
| π0 + Depth Images | 7/10 | 9/10 | 5/10 | 7/10 | 4/10 | 32/50 |
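The Total Success column is the sum of the five per-task counts, each out of 10 trials. As a minimal sketch of that arithmetic (the dictionary layout and the helper names `total_success` and `aimbot_gain` are illustrative assumptions, not part of the paper's code), the totals and the gain from adding AimBot can be recomputed from the table:

```python
# Per-task successes out of 10 trials, copied from the table above.
# Task order: Fruits in Box, Tennis Ball in Drawer, Bread in Toaster,
# Place Coffee Cup, Egg in Carton.
TRIALS_PER_TASK = 10

results = {
    "OpenVLA-OFT": [7, 6, 4, 2, 2],
    "OpenVLA-OFT + AimBot": [9, 7, 9, 8, 3],
    "π0-FAST": [10, 10, 9, 7, 6],
    "π0-FAST + AimBot": [10, 10, 10, 9, 8],
    "π0": [7, 7, 4, 5, 4],
    "π0 + AimBot": [10, 10, 7, 8, 8],
    "π0 + Traces": [8, 8, 5, 2, 2],
    "π0 + RoboPoint": [8, 9, 4, 6, 0],
    "π0 + Depth Images": [7, 9, 5, 7, 4],
}


def total_success(model):
    """Return (successes, trials) summed over the five tasks (hypothetical helper)."""
    counts = results[model]
    return sum(counts), TRIALS_PER_TASK * len(counts)


def aimbot_gain(base):
    """Extra total successes of `base + AimBot` over `base` (hypothetical helper)."""
    return total_success(base + " + AimBot")[0] - total_success(base)[0]


if __name__ == "__main__":
    for base in ("OpenVLA-OFT", "π0-FAST", "π0"):
        successes, trials = total_success(base)
        print(f"{base}: {successes}/{trials}, AimBot gain: {aimbot_gain(base):+d}")
```

Working from the raw counts rather than percentages keeps the comparison exact, which matters with only 10 trials per task.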
Task: Put the bread inside the toaster.
Failure: π0
Success: π0 + AimBot
Task: Put the cup on the coffee machine.
Failure: π0
Success: π0 + AimBot
Task: Put the tennis ball inside the drawer.
Failure: π0
Success: π0 + AimBot
Task: Put the eggs inside the egg carton.
Failure: π0
Success: π0 + AimBot
Task: Put all the fruits into the box.
Failure: π0
Success: π0 + AimBot
Simulation Videos on LIBERO
| Model | LIBERO Spatial | LIBERO Object | LIBERO Goal | LIBERO Long | Average Success Rate |
|---|---|---|---|---|---|
| OpenVLA-OFT | 96.2 | 97.3 | 93.9 | 87.5 | 93.8 |
| OpenVLA-OFT + AimBot | 95.2 (–1.0) | 99.1 (+1.8) | 94.2 (+0.3) | 91.2 (+3.7) | 95.0 (+1.2) |
| π0-FAST | 96.5 | 96.8 | 93.6 | 81.6 | 92.1 |
| π0-FAST + AimBot | 96.9 (+0.4) | 96.8 (+0.0) | 94.0 (+0.4) | 87.1 (+5.5) | 93.7 (+1.6) |
| π0 | 96.8 | 98.8 | 95.8 | 85.2 | 94.2 |
| π0 + AimBot | 96.9 (+0.1) | 98.4 (–0.4) | 97.2 (+1.4) | 91.0 (+5.8) | 95.9 (+1.7) |
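The parenthesized values in the AimBot rows are differences in success rate (in percentage points) relative to the corresponding base model. The sketch below recomputes them from the table. The dictionary layout and the helper names `suite_deltas` and `mean_rate` are assumptions made for illustration.

```python
# Success rates (%) copied from the LIBERO table above.
# Suite order: Spatial, Object, Goal, Long.
libero = {
    "OpenVLA-OFT": [96.2, 97.3, 93.9, 87.5],
    "OpenVLA-OFT + AimBot": [95.2, 99.1, 94.2, 91.2],
    "π0-FAST": [96.5, 96.8, 93.6, 81.6],
    "π0-FAST + AimBot": [96.9, 96.8, 94.0, 87.1],
    "π0": [96.8, 98.8, 95.8, 85.2],
    "π0 + AimBot": [96.9, 98.4, 97.2, 91.0],
}


def suite_deltas(base):
    """Per-suite change (percentage points) from adding AimBot to `base`."""
    with_aimbot = libero[base + " + AimBot"]
    # Round to one decimal to match the table and hide float noise.
    return [round(a - b, 1) for a, b in zip(with_aimbot, libero[base])]


def mean_rate(model):
    """Unweighted mean of the four per-suite success rates."""
    rates = libero[model]
    return sum(rates) / len(rates)


if __name__ == "__main__":
    for base in ("OpenVLA-OFT", "π0-FAST", "π0"):
        print(base, suite_deltas(base), f"mean {mean_rate(base):.3f}")
```

Because the per-suite numbers in the table are already rounded, a mean recomputed this way can differ from the reported Average Success Rate in the last digit. That is expected if the reported averages were computed from unrounded rates.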
Task: Pick up the orange juice and place it in the basket.
Failure: π0
Success: π0 + AimBot
Task: Put the moka pot in the microwave and close it.
Failure: π0
Success: π0 + AimBot
Task: Put the white mug on the plate and put the chocolate pudding to the right of the plate.
Failure: π0
Success: π0 + AimBot
Task: Put the wine bottle on the rack.
Failure: π0
Success: π0 + AimBot
Task: Put the yellow and white mug in the microwave and close it.
Failure: π0
Success: π0 + AimBot
Task: Turn on the stove and put the moka pot on it.
Failure: π0
Success: π0 + AimBot
Failure Cases
Task Goal: Put the bread in the toaster.
Failure: π0 + AimBot
Task Goal: Put the eggs into the egg carton.
Failure: π0 + AimBot
Task Goal: Put the cup on the coffee machine.
Failure: π0 + AimBot
AimBot: A Simple Auxiliary Visual Cue to Enhance Spatial Awareness of Visuomotor Policies