VBC: Visual Whole-Body Control
for Legged Loco-Manipulation
- Minghuan Liu *,1,2
- Zixuan Chen *,1,3
- Xuxin Cheng 1
- Yandong Ji 1
- Rizhao Qiu 1
- Ruihan Yang 1
- Xiaolong Wang 1
- 1UC San Diego
- 2Shanghai Jiao Tong University
- 3Fudan University
- *Equal contribution.
- CoRL 2024 (Oral)
We propose a sim-to-real framework that enables the robot to grasp different objects in varying surroundings.
Abstract
We study the problem of mobile manipulation using legged robots equipped with an arm, namely legged loco-manipulation. The robot legs, while usually utilized for mobility, offer an opportunity to amplify the manipulation capabilities by conducting whole-body control. That is, the robot can control the legs and the arm at the same time to extend its workspace. We propose a framework that can conduct the whole-body control autonomously with visual observations. Our approach, namely Visual Whole-Body Control (VBC), is composed of a low-level policy using all degrees of freedom to track the end-effector manipulator position and a high-level policy proposing the end-effector position based on visual inputs. We train both levels of policies in simulation and perform Sim2Real transfer for real robot deployment. We perform extensive experiments and show significant improvements over baselines in picking up diverse objects in different configurations (heights, locations, orientations) and environments.
Multi-Object Grasping Pipeline
Visual Input Visualization
More Examples
Low-Level and High-Level Policy Training in Simulation
All policies are trained in parallel in simulation. The low-level policy is trained to track the given end-effector pose and body velocities, while the high-level policy is trained to propose the end-effector position based on visual inputs. The low-level policy may fail to track some goals because of IK failures, but the high-level policy learns to avoid those poses.
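The two-level structure can be illustrated with a toy control loop. This is only a minimal sketch under stated assumptions, not the training code of the paper: `high_level_policy`, `low_level_policy`, `run_episode`, and the reach limit `REACH` are hypothetical names and values introduced for illustration, the visual policy is replaced by a stub that is handed the object position directly, and the whole-body tracker is replaced by a point that moves toward its goal inside a sphere.

```python
import math

REACH = 0.8  # hypothetical reachable radius (m) of the toy end-effector, not from the paper


def high_level_policy(object_pos, ee_pos):
    """Stub for the visual policy: propose an end-effector goal.

    A real policy would consume depth images and object masks; here the
    object position is assumed to be known and is returned as the goal.
    """
    return tuple(object_pos)


def low_level_policy(ee_pos, ee_goal, max_step=0.05):
    """Stub tracker: move toward the goal by at most max_step per call,
    then project the result back into a sphere of radius REACH."""
    delta = [g - p for g, p in zip(ee_goal, ee_pos)]
    dist = math.sqrt(sum(d * d for d in delta))
    if dist > max_step:
        delta = [d * max_step / dist for d in delta]
    new_pos = [p + d for p, d in zip(ee_pos, delta)]
    norm = math.sqrt(sum(c * c for c in new_pos))
    if norm > REACH:
        new_pos = [c * REACH / norm for c in new_pos]
    return tuple(new_pos)


def run_episode(object_pos, steps=100):
    """Alternate the two levels and return the final pose and tracking error."""
    ee_pos = (0.3, 0.0, 0.2)  # arbitrary starting pose for the toy example
    for _ in range(steps):
        goal = high_level_policy(object_pos, ee_pos)
        ee_pos = low_level_policy(ee_pos, goal)
    return ee_pos, math.dist(ee_pos, object_pos)


if __name__ == "__main__":
    _, err_near = run_episode((0.5, 0.1, 0.3))  # goal inside the reachable sphere
    _, err_far = run_episode((1.5, 0.0, 0.0))   # goal outside the reachable sphere
    print(f"reachable goal error:   {err_near:.3f}")
    print(f"unreachable goal error: {err_far:.3f}")
```

In this toy setting, a goal outside the reachable sphere leaves a persistent tracking error, which is the kind of signal a learned high-level policy could use to stop proposing such goals.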
Tracking SAM Annotation Explained
We use this tool only to assign masks to the objects to be grasped; after annotation, all pickup behaviors are performed automatically.
Representative Failure Cases
The robot may fail for several reasons, such as failures of depth perception, loss of the tracking mask, the improper design of the Z1 gripper, and generalization errors.
Funny Cases
BibTeX
@article{liu2024visual,
title={Visual Whole-Body Control for Legged Loco-Manipulation},
author={Liu, Minghuan and Chen, Zixuan and Cheng, Xuxin and Ji, Yandong and Qiu, Rizhao and Yang, Ruihan and Wang, Xiaolong},
journal={The 8th Conference on Robot Learning},
year={2024}
}