We propose novel metrics, tasks, visualization tools, and methods for “policy mobilization”: the problem of taking a non-mobile manipulation policy and finding a suitable initial robot pose from which to execute it on a mobile platform.
Introducing
Policy Mobilization
Given an existing manipulation policy trained on data collected from limited viewpoints, policy mobilization aims to find an optimal robot pose in an unseen environment to successfully execute this policy.
Highlights
What Policy Mobilization Enables
Chaining a sequence of manipulation skills in-the-wild.
Generalizing zero-shot to unseen scene layouts.
Operating in large spaces that the policies have not explored during training.
Adapting to novel object heights.
How we approach policy mobilization
Our Recipe, Simple Steps
Our method for policy mobilization follows four simple steps. Provide a pre-trained robot manipulation policy along with its training data, and the method handles the rest.
Capture
Drive the robot around to capture the test-time scene.
Splat
Build a 3D Gaussian Splatting model from the capture.
Score
Use our novel hybrid score function to rate robot poses.
Optimize
Find the best robot pose with sampling-based optimization.
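To make the Score and Optimize steps concrete, below is a minimal sketch of sampling-based pose optimization. It assumes a hypothetical score_pose(pose) function that renders the reconstructed 3D Gaussian Splatting scene from a candidate base pose and returns the hybrid score; the function names, the cross-entropy-style update, and all hyperparameters are illustrative, not our exact implementation.

# Minimal sketch of the Optimize step, assuming a hypothetical score_pose(pose)
# that renders the 3DGS scene from a candidate SE(2) base pose (x, y, yaw)
# and returns the hybrid score. Hyperparameters are illustrative only.
import numpy as np

def optimize_base_pose(score_pose, pose_mean, pose_std,
                       n_iters=10, n_samples=64, n_elites=8):
    """Cross-entropy-style sampling optimization over an SE(2) base pose."""
    mean = np.asarray(pose_mean, dtype=float)
    std = np.asarray(pose_std, dtype=float)
    best_pose, best_score = None, -np.inf
    for _ in range(n_iters):
        # Sample candidate base poses around the current distribution.
        samples = np.random.normal(mean, std, size=(n_samples, 3))
        scores = np.array([score_pose(p) for p in samples])
        # Track the best-scoring pose seen so far.
        if scores.max() > best_score:
            best_score = float(scores.max())
            best_pose = samples[scores.argmax()].copy()
        # Refit the sampling distribution to the highest-scoring candidates.
        elites = samples[np.argsort(scores)[-n_elites:]]
        mean, std = elites.mean(axis=0), elites.std(axis=0) + 1e-3
    return best_pose, best_score

In practice, the initial pose distribution could be seeded near the target object, and the returned pose serves as the navigation goal before the manipulation policy is executed.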
Comparisons
Real Robot Results
In real-world experiments, we compare our method with two baselines: BC w/ Nav and a Human baseline. For BC w/ Nav, we train an end-to-end imitation learning policy from combined navigation and manipulation data. For the Human baseline, we ask users to manually drive the robot to the base pose they believe is best for executing each manipulation task, without telling them how the policy was trained. All videos below are played at 4x speed.
Introducing
The Mobi-π Framework
To motivate further research on policy mobilization, we propose the Mobi-π framework:
5 simulated tasks based on RoboCasa
3 navigation and imitation learning baselines
2 mobilization feasibility metrics
Simulation Task Suite
To effectively study the policy mobilization problem, we develop a suite of simulation environments built on RoboCasa as a benchmarking setup. We pick five single-stage manipulation tasks: Close Door, Close Drawer, Turn on Faucet, Turn on Microwave, and Turn on Stove.
Baselines
We study three baselines for policy mobilization, divided into two categories. Baselines of the first type navigate to the object of interest without considering the manipulation policy's capabilities; these methods are not policy-aware and include LeLaN and VLFM. Baselines of the second type are policy-aware and leverage large-scale data to connect navigation with manipulation. We use BC w/ Nav, a Behavior Transformer trained to jointly perform navigation and manipulation from combined demonstrations, as the representative method in this category.
Metrics
We quantify how feasible it is to mobilize a policy from both spatial and visual perspectives. We evaluate these feasibility metrics on the simulated tasks and discuss how they correlate with experimental results.
Team
Meet Our Team
This work would not be possible without the awesome team.
Jingyun completed this work during an internship with Toyota Research Institute.
* Isabella and Brandon contributed equally.
FAQ
Questions? Answers.
Why should I care about policy mobilization? Why not simply train a mobile manipulation policy from a large dataset?
Indeed, one approach to learning mobile manipulation is to train a policy from a dataset that includes both navigation and manipulation data. However, training such a policy typically requires large amounts of training data, since the policy needs to not only generalize to different navigation and manipulation scenarios but also seamlessly coordinate navigation and manipulation. We showed in our simulation experiments that an imitation learning baseline that learns an end-to-end policy for navigation and manipulation fails to perform well in unseen room layouts despite learning from 5x more training data. Compared to training an end-to-end mobile manipulation policy, our policy mobilization framework provides a more data-efficient approach to learning mobile manipulation. Meanwhile, our problem formulation complements existing efforts to improve the robustness of manipulation policies and remains compatible with them.
Why not train a mobile manipulation policy in simulation and transfer to the real world?
Training a mobile manipulation policy in simulation requires creating an interactive and physically accurate simulation. This is still an unsolved problem for many tasks, like those with deformable and granular objects.
How does the approach handle distractors?
Our simulation setup includes distractor objects that are unseen during training. We find that our method, which uses DINO dense descriptors to score robot poses, is able to ignore these irrelevant objects.
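For illustration, here is a minimal sketch of how dense visual descriptors can compare a rendered candidate view against a view from the policy's training data. It uses a publicly available DINOv2 backbone as a stand-in for the DINO dense descriptors; the model choice, resizing, and mean cosine-similarity pooling are assumptions for the sketch, not our exact score function.

# Minimal sketch: compare a rendered candidate view against a training view
# using dense patch descriptors from a DINOv2 backbone (a stand-in choice).
import torch
import torch.nn.functional as F

model = torch.hub.load("facebookresearch/dinov2", "dinov2_vits14").eval()
IMAGENET_MEAN = torch.tensor([0.485, 0.456, 0.406]).view(3, 1, 1)
IMAGENET_STD = torch.tensor([0.229, 0.224, 0.225]).view(3, 1, 1)

@torch.no_grad()
def dense_features(image):
    """image: (3, H, W) float tensor in [0, 1] -> L2-normalized patch descriptors."""
    x = (image - IMAGENET_MEAN) / IMAGENET_STD
    x = F.interpolate(x[None], size=(224, 224), mode="bilinear", align_corners=False)
    feats = model.forward_features(x)["x_norm_patchtokens"]  # (1, num_patches, C)
    return F.normalize(feats[0], dim=-1)

@torch.no_grad()
def view_similarity(rendered_view, training_view):
    """Mean cosine similarity between corresponding patch descriptors of two views."""
    a, b = dense_features(rendered_view), dense_features(training_view)
    return (a * b).sum(dim=-1).mean().item()

Because the similarity is computed over local patch descriptors, unrelated distractor objects contribute little to the aggregate score when they do not appear in the training views.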
Can the method handle imperfect scene reconstruction?
Yes. Our learned 3DGS models have artifacts and surface color inconsistencies, yet the method still performs well.
I have other questions. How can I get an answer?
Please try to first find answers in our paper and supplementary materials. If you still cannot find an answer, you are welcome to contact us via email.
Found our work useful?
Cite Us
@article{yang2025mobipi,
title={Mobi-$\pi$: Mobilizing Your Robot Learning Policy},
author={Yang, Jingyun and Huang, Isabella and Vu, Brandon and Bajracharya, Max and Antonova, Rika and Bohg, Jeannette},
journal={arXiv preprint arXiv:2505.23692},
year={2025}
}
Want more reading?
Related Research
Generalizable and data-efficient policy learning method powered by equivariant architectures, deployed on a mobile platform.
Combining hybrid IL agent and whole-body controller enables sample-efficient and generalizable mobile manipulation.
Reset-free finetuning system that autonomously improves a pre-trained multi-task manipulation policy without human intervention.