ReBot: Scaling Robot Learning with
Real-to-Sim-to-Real Robotic Video Synthesis

Yu Fang¹ Yue Yang¹ Xinghao Zhu² Kaiyuan Zheng³ Gedas Bertasius¹ Daniel Szafir¹ Mingyu Ding¹

¹ UNC Chapel Hill

² Robotics and AI Institute

³ University of Washington

IROS 2025

TL;DR: We propose a novel real-to-sim-to-real approach for scaling real robot datasets and adapting VLA models to target domains.

Abstract

Vision-language-action (VLA) models present a promising paradigm by training policies directly on real robot datasets like Open X-Embodiment. However, the high cost of real-world data collection hinders further data scaling, thereby restricting the generalizability of VLAs.

In this paper, we introduce ReBot, a novel real-to-sim-to-real approach for scaling real robot datasets and adapting VLA models to target domains, which is the last-mile deployment challenge in robot manipulation. Specifically, ReBot replays real-world robot trajectories in simulation to diversify manipulated objects (real-to-sim), and integrates the simulated movements with inpainted real-world background to synthesize physically realistic and temporally consistent robot videos (sim-to-real).

Our approach has several advantages:

It enjoys the benefit of real data to minimize the sim-to-real gap.
It leverages the scalability of simulation.
It can generalize a pretrained VLA to a target domain with fully automated data pipelines.

Extensive experiments in both simulation and real-world environments show that ReBot significantly enhances the performance and robustness of VLAs. For example, in SimplerEnv with the WidowX robot, ReBot improved the in-domain performance of Octo by 7.2% and OpenVLA by 21.8%, and out-of-domain generalization by 19.9% and 9.4%, respectively. For real-world evaluation with a Franka robot, ReBot increased the success rates of Octo by 17% and OpenVLA by 20%.

Method

ReBot includes three key components:

Real-to-Sim Trajectory Replay. For each real-world episode, we automatically set up digital twins in a simulation environment and replay the real-world robot trajectory to obtain simulated movements for manipulating new objects. We validate the scalability of our approach by demonstrating that real-world trajectories can be successfully reused to manipulate objects of different shapes in simulation.
Real-world Background Inpainting. To obtain a task-agnostic real-world background for video synthesis, we introduce an automated inpainting module that uses GroundedSAM2 to segment and track the robot and object (i.e., task-specific elements) in the original real-world videos, and removes them with ProPainter.
Sim-to-Real Video Synthesis. Finally, we integrate the simulated movements with the task-agnostic real-world background, producing synthetic videos with realistic physics and excellent temporal consistency.
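
To make the last step more concrete, here is a minimal sketch of how simulated movements could be integrated with an inpainted background, frame by frame. It is not the ReBot implementation: the page does not describe the compositing operation, so per-pixel alpha blending with a coverage mask is an assumption, and the names `composite_frame`, `composite_video`, and `sim_mask` are hypothetical. Frames are plain nested lists to keep the example dependency-free; a real pipeline would more likely use an array library.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # one RGB pixel, channels in 0..255
Frame = List[List[Pixel]]      # rows of pixels
Mask = List[List[float]]       # per-pixel coverage in [0, 1]


def composite_frame(background: Frame, sim_rgb: Frame, sim_mask: Mask) -> Frame:
    """Blend one simulated render over one inpainted background frame.

    Assumption (not stated in the source): sim_mask is 1.0 where the
    simulated robot or object covers the pixel and 0.0 elsewhere.
    """
    out: Frame = []
    for bg_row, fg_row, mask_row in zip(background, sim_rgb, sim_mask):
        out.append([
            tuple(round(a * f + (1.0 - a) * b) for f, b in zip(fg, bg))
            for bg, fg, a in zip(bg_row, fg_row, mask_row)
        ])
    return out


def composite_video(
    backgrounds: List[Frame], sim_rgbs: List[Frame], sim_masks: List[Mask]
) -> List[Frame]:
    """Composite a whole clip; all three sequences must be frame-aligned."""
    if not (len(backgrounds) == len(sim_rgbs) == len(sim_masks)):
        raise ValueError("all inputs must contain the same number of frames")
    return [
        composite_frame(bg, fg, mask)
        for bg, fg, mask in zip(backgrounds, sim_rgbs, sim_masks)
    ]
```

In this sketch the background frames would come from the inpainting step and the renders and masks from the trajectory replay step, so the same background clip can be reused with many different simulated objects.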

Comparison of Synthetic Videos

Example with DROID: redbull can → coke can

Original video

ROSIE

ReBot

Example with BridgeData V2: spatula → spoon

Original video

ROSIE

ReBot

Example with our dataset: grape → carrot

Original video

ROSIE

ReBot

Evaluation in Simulation Environment

In-domain Performance on the WidowX robot

Put spoon on towel

Put carrot on plate

Stack green block on yellow block

Put eggplant in yellow basket

Generalization Performance on the WidowX robot

Physical: unseen object sizes

Spoon size: 0.8x

Spoon size: 1.2x

Carrot size: 0.8x

Carrot size: 1.2x

Semantics: unseen instructions

"Place spoon onto towel"

"Put vegetable on plate"

"Put green cube onto yellow cube"

"Move eggplant into basket"

Subject: unseen objects

Put apple on plate

Put fanta can on towel

Put orange on plate

Put red bull can on plate

Cross-embodiment Performance on Google Robot

Pick coke can (standing)

Pick coke can (horizontal)

Pick coke can (vertical)

Evaluation in Real-world Environment

Put carrot in blue plate

Put grape in yellow plate

Put fanta can in blue plate

Put black cube in yellow plate

Citation

@article{fang2025rebot,
  title={ReBot: Scaling Robot Learning with Real-to-Sim-to-Real Robotic Video Synthesis},
  author={Fang, Yu and Yang, Yue and Zhu, Xinghao and Zheng, Kaiyuan and Bertasius, Gedas and Szafir, Daniel and Ding, Mingyu},
  journal={arXiv preprint arXiv:2503.14526},
  year={2025}
}
