View-Invariant Policy Learning via Zero-Shot Novel View Synthesis
Using the ZeroNVS novel view synthesis model, we learn diffusion policies that take as input both third-person and (unaugmented) wrist camera observations to solve a "put cup in saucer" task from multiple novel viewpoints. Here we show successful rollouts from policies trained on datasets augmented by a pretrained ZeroNVS model and a ZeroNVS model finetuned on DROID data. Again, the original training dataset only contains trajectories observed from a single RGB view.
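The augmentation described above, where only the third-person image is re-rendered and the wrist image is left untouched, can be sketched roughly as follows. This is a minimal illustration and not the authors' pipeline: `augment_trajectory`, the dictionary keys, and the `synthesize_view` and `sample_viewpoint` callables are hypothetical names standing in for a view synthesis model such as ZeroNVS and a viewpoint sampler. Drawing a new viewpoint at every step is also an assumption.

```python
def augment_trajectory(trajectory, synthesize_view, sample_viewpoint):
    """Return a copy of `trajectory` with re-rendered third-person images.

    `trajectory` is a list of dicts with the (hypothetical) keys
    "third_person", "wrist", and "action". `synthesize_view(image, viewpoint)`
    stands in for a novel view synthesis model; `sample_viewpoint()` returns
    a target camera viewpoint in whatever form that model expects.
    """
    augmented = []
    for step in trajectory:
        # Assumption: a fresh viewpoint is drawn for every step.
        viewpoint = sample_viewpoint()
        augmented.append({
            # Only the third-person observation is replaced.
            "third_person": synthesize_view(step["third_person"], viewpoint),
            # The wrist observation and the action are passed through unchanged.
            "wrist": step["wrist"],
            "action": step["action"],
        })
    return augmented
```

Because the function builds new step dictionaries, the single-view source trajectory is left unmodified and can be kept alongside its augmented copies.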
View-Invariant Policy Learning
via Zero-Shot Novel View Synthesis
Anonymous Authors
To supplement our submission, on this page we provide visualizations of augmented dataset trajectories, policy rollouts from novel test viewpoints in real and simulated environments, comparisons between view synthesis methods, and a brief overview of our work.
Qualitative Behavior of Learned Policies on Novel Test Viewpoints
Here we show rollouts of policies trained with view synthesis model augmentation when placed at random test viewpoints drawn from the quarter-circle arc distribution. Even when the policies augmented using the finetuned ZeroNVS model fail, they tend to make more progress on the task than the single-view and depth estimation and reprojection baselines. Critically, all of these policies are trained using a single-view source demonstration dataset.
[Video grid: rollouts on the Hammer, Coffee, and Stack tasks for three training settings: Single original view, Depth est. + Reproj., and ZeroNVS (Finetuned).]
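To make the idea of a test viewpoint drawn from a quarter-circle arc concrete, here is a minimal sketch of one way such a sampler could look. The function name, the uniform azimuth range of 0 to 90 degrees, the fixed radius and height, and the choice to aim the camera at a scene center are all assumptions made for illustration. The page does not specify the exact parameterization.

```python
import math
import random


def sample_quarter_arc_viewpoint(radius, height, center=(0.0, 0.0, 0.0), rng=random):
    """Sample a camera position on a quarter-circle arc around `center`.

    Returns (azimuth, position, look_dir), where `look_dir` is the unit
    vector pointing from the camera toward `center`.
    """
    # Assumption: azimuth is uniform over a 90-degree arc.
    azimuth = rng.uniform(0.0, math.pi / 2)
    cx, cy, cz = center
    position = (
        cx + radius * math.cos(azimuth),
        cy + radius * math.sin(azimuth),
        cz + height,
    )
    # Direction from the camera to the scene center, normalized to unit length.
    dx, dy, dz = cx - position[0], cy - position[1], cz - position[2]
    norm = math.sqrt(dx * dx + dy * dy + dz * dz)
    look_dir = (dx / norm, dy / norm, dz / norm)
    return azimuth, position, look_dir
```

Passing a seeded `random.Random` instance as `rng` makes the sampled test viewpoints reproducible across evaluation runs.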
