ResMimic: From General Motion Tracking to Humanoid Whole-Body Loco-Manipulation via Residual Learning
Siheng Zhao1,2§,
Yanjie Ze1,3§,
Yue Wang2,
C. Karen Liu1,3†,
Pieter Abbeel1,4†,
Guanya Shi1,5†,
Rocky Duan1†
1Amazon FAR (Frontier AI & Robotics)
2University of Southern California
3Stanford University
4UC Berkeley
5Carnegie Mellon University
§Work done while interning at Amazon FAR
†FAR Co-Lead
Abstract
Humanoid whole-body loco-manipulation promises transformative capabilities for daily service and warehouse tasks.
While recent advances in general motion tracking (GMT) have enabled humanoids to reproduce diverse human motions, these policies lack the precision and object awareness required for loco-manipulation.
To this end, we introduce ResMimic, a two-stage residual learning framework for precise and expressive humanoid control from human motion data.
First, a GMT policy, trained on large-scale human-only motion, serves as a task-agnostic base for generating human-like whole-body movements.
An efficient but precise residual policy is then learned to refine the GMT outputs, improving locomotion and incorporating object interaction.
To further facilitate efficient training, we design (i) a point-cloud-based object tracking reward for smoother optimization, (ii) a contact reward that encourages accurate humanoid body-object interactions, and (iii) a curriculum-based virtual object controller to stabilize early training.
We evaluate ResMimic both in simulation and on a real Unitree G1 humanoid.
Results show substantial gains in task success, training efficiency, and robustness over strong baselines.
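The two-stage structure described above can be sketched in a few lines. This is a minimal illustration, assuming the residual policy's output is simply added to the GMT action. The function names, the toy stand-in policies, and the `residual_scale` parameter are hypothetical and are not taken from the ResMimic code.

```python
import numpy as np

def gmt_base_policy(robot_state, reference_motion):
    """Toy stand-in for the pretrained GMT policy (hypothetical).
    Takes a proportional step toward the reference joint targets."""
    return 0.5 * (reference_motion - robot_state)

def residual_policy(robot_state, object_state, reference_object_state):
    """Toy stand-in for the learned residual policy (hypothetical).
    Returns a small correction driven by the object tracking error."""
    object_error = reference_object_state - object_state
    correction = np.zeros_like(robot_state)
    n = min(correction.size, object_error.size)
    correction[:n] = 0.1 * object_error[:n]
    return correction

def compose_action(base_action, residual_action, residual_scale=1.0):
    """Assumed composition rule: final action = base action + scaled residual."""
    base_action = np.asarray(base_action, dtype=float)
    residual_action = np.asarray(residual_action, dtype=float)
    if base_action.shape != residual_action.shape:
        raise ValueError("base and residual actions must have the same shape")
    return base_action + residual_scale * residual_action

# Illustrative values only; a real humanoid has far more joints.
robot_state = np.zeros(4)
reference_motion = np.array([0.2, -0.2, 0.4, 0.0])
object_state = np.zeros(3)
reference_object_state = np.array([0.1, 0.0, -0.1])

base = gmt_base_policy(robot_state, reference_motion)
delta = residual_policy(robot_state, object_state, reference_object_state)
action = compose_action(base, delta)
```

In this sketch the base policy never sees the object, which mirrors the abstract's description of a task-agnostic base trained on human-only motion. Only the residual term depends on object state.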
Method Overview
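Two of the training aids named in the abstract, the point-cloud-based object tracking reward and the curriculum-based virtual object controller, can be illustrated with a short sketch. Everything specific here is an assumption made for illustration: the exponential reward kernel, the `sharpness` value, the PD form of the virtual force, the linear gain decay, and all function and parameter names.

```python
import numpy as np

def transform_points(local_points, rotation, translation):
    """Map object-frame points (N, 3) into the world frame."""
    return local_points @ rotation.T + translation

def point_cloud_tracking_reward(local_points, rotation, position,
                                ref_rotation, ref_position, sharpness=5.0):
    """Assumed reward: exp(-sharpness * mean point-to-point distance).
    Comparing sampled surface points folds position and orientation
    error into one smooth term."""
    current = transform_points(local_points, rotation, position)
    reference = transform_points(local_points, ref_rotation, ref_position)
    mean_distance = np.linalg.norm(current - reference, axis=1).mean()
    return float(np.exp(-sharpness * mean_distance))

def virtual_object_force(position, velocity, ref_position, progress,
                         kp0=200.0, kd0=20.0):
    """Assumed curriculum: a PD force pulls the object toward its reference,
    with gains decayed linearly to zero as training progress goes 0 -> 1."""
    scale = max(0.0, 1.0 - progress)
    return scale * (kp0 * (ref_position - position) - kd0 * velocity)

# Illustrative object: the eight corners of a unit cube in its own frame.
cube = np.array([[x, y, z] for x in (0, 1) for y in (0, 1) for z in (0, 1)],
                dtype=float)
identity = np.eye(3)
origin = np.zeros(3)

perfect = point_cloud_tracking_reward(cube, identity, origin, identity, origin)
shifted = point_cloud_tracking_reward(cube, identity, np.array([0.1, 0.0, 0.0]),
                                      identity, origin)
early_force = virtual_object_force(origin, origin, np.array([0.0, 0.0, 1.0]),
                                   progress=0.0)
late_force = virtual_object_force(origin, origin, np.array([0.0, 0.0, 1.0]),
                                  progress=1.0)
```

With these assumptions the reward equals 1 when the object matches its reference pose and decays smoothly as the point clouds separate, while the assisting force vanishes by the end of training so the final policy must move the object on its own.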
Expressive Whole-Body Loco-Manipulation
Carry Box onto Back (1x)
Kneel on One Knee & Lift Box (1x)
Heavy Object with Whole-Body Contact
Squat & Lift Box with Whole-Body Contact (1x), 4.5kg
Lift Chair with Whole-Body Contact (1x), 4.5kg
Lift Chair with Whole-Body Contact (1x), 5.5kg
Robustness Test
Lift Chair with Whole-Body Contact (1x), 4.5kg (4 clips)
General Object Interaction
Sit on Chair (1x)
ResMimic (Success ✅)
Sit on Chair (1x)
Sit on Chair (1x)
Base Policy (Failure ❌)
Sit on Chair (1x)
Sit on Chair (1x)
Continuous Execution with MoCap Input
Lift Box with Random Object Initial Pose
Autonomous Consecutive Lift Box
Reactive Behavior to External Perturbation
Comparison with Baselines
ResMimic (Success ✅)
Base Policy (Failure ❌)
Train from Scratch (Failure ❌)
Base Policy + Finetune (Failure ❌)
BibTeX
@misc{zhao2025resmimicgeneralmotiontracking,
title={ResMimic: From General Motion Tracking to Humanoid Whole-body Loco-Manipulation via Residual Learning},
author={Siheng Zhao and Yanjie Ze and Yue Wang and C. Karen Liu and Pieter Abbeel and Guanya Shi and Rocky Duan},
year={2025},
eprint={2510.05070},
archivePrefix={arXiv},
primaryClass={cs.RO},
url={https://arxiv.org/abs/2510.05070},
}