Object-Centric Dexterous Manipulation from Human Motion Data
CoRL 2024
Despite being trained only in simulation, our system demonstrates zero-shot transfer to a real-world system of two robots, each equipped with a dexterous hand.
Abstract
Manipulating objects to achieve desired goal states is a basic but important skill for dexterous manipulation. Human hand motions demonstrate proficient manipulation capability, providing valuable data for training robots with multi-finger hands. Despite this potential, substantial challenges arise due to the embodiment gap between human and robot hands. In this work, we introduce a hierarchical policy learning framework that uses human hand motion data for training object-centric dexterous robot manipulation. At the core of our method is a high-level trajectory generative model, learned with a large-scale human hand motion capture dataset, to synthesize human-like wrist motions conditioned on the desired object goal states. Guided by the generated wrist motions, deep reinforcement learning is further used to train a low-level finger controller that is grounded in the robot's embodiment to physically interact with the object to achieve the goal. Through extensive evaluation across 10 household objects, our approach not only demonstrates superior performance but also showcases generalization capability to novel object geometries and goal states. Furthermore, we transfer the learned policies from simulation to a real-world bimanual dexterous robot system, further demonstrating its applicability in real-world scenarios.
Method
(A) Training: First, we use human motion capture data to train a generative model that synthesizes dual-hand wrist trajectories conditioned on an object trajectory. We then use reinforcement learning (RL) to train a low-level robot controller conditioned on the dual-hand trajectories produced by the trained high-level planner. During this process, we augment the data in simulation to improve the high-level planner and the low-level controller simultaneously. (B) Inference: Given a single object goal trajectory, our framework generates a dual-hand reference trajectory that guides the low-level controller to accomplish the task.
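The two-level inference pipeline can be sketched as a plan-then-track loop. This is a minimal illustrative sketch only: `HighLevelPlanner` and `LowLevelController` are hypothetical stand-ins for the paper's trained generative model and RL policy, and the hand-offset and action dimensions are placeholder assumptions, not the actual system.

```python
class HighLevelPlanner:
    """Generative model: object goal trajectory -> dual-hand wrist trajectory."""
    def plan(self, object_traj):
        # Placeholder: offset each object position to obtain left/right wrist poses.
        # The real planner is a learned model trained on human mocap data.
        return [
            ([x - 0.15, y, z + 0.05], [x + 0.15, y, z + 0.05])
            for (x, y, z) in object_traj
        ]

class LowLevelController:
    """RL policy: robot state + wrist reference -> finger and wrist commands."""
    N_FINGER_JOINTS = 24  # simplified stand-in for the hands' actuated DoF

    def act(self, state, wrist_ref):
        left, right = wrist_ref
        # Placeholder action: track the wrist reference, zero finger targets.
        return left + right + [0.0] * self.N_FINGER_JOINTS

def rollout(object_goal_traj):
    """Plan once per episode, then query the low-level controller each step."""
    planner, controller = HighLevelPlanner(), LowLevelController()
    wrist_traj = planner.plan(object_goal_traj)
    state = [0.0] * 10  # stub robot/object state
    return [controller.act(state, ref) for ref in wrist_traj]

actions = rollout([(0.0, 0.0, 0.4)] * 50)
print(len(actions), len(actions[0]))  # 50 steps, 6 wrist + 24 finger dims
```

The key design choice mirrored here is that the planner runs once over the whole goal trajectory (object-centric conditioning), while the controller closes the loop at every timestep against the generated wrist reference.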
Experiments
Environment Setups
Overview of the environment setups:
(a). Simulation workspace. We employ two Shadow Hands, each mounted on a separate UR10e robot, arranged side by side.
(b). Object sets in simulation and the real world.
(c). Real-world workspace. Mirroring the simulation, the robot system uses the same Shadow Hands and UR10e robots.
Importing Human Motion Capture Data into Simulation
Optimizing Human Mocap Data via Reinforcement Learning
Learning Object-Centric Dexterous Manipulation in Isaac-Gym
Upper Right Video: High-level planner output (wrist motion generation); the object motion is replayed for visualization.
Lower Video: Low-level policy output (finger + wrist motion); fully autonomous results with no object motion replay.