Virtual Fitting Room: Generating Arbitrarily Long Videos of Virtual Try-On from a Single Image
Jun-Kun Chen, Aayush Bansal, Minh Phuoc Vo, Yu-Xiong Wang
Overview
We present Virtual Fitting Room (VFR), a video generation framework that creates minute-scale try-on videos from a single user image, a target garment, and a reference motion. Unlike traditional try-on methods, which produce short clips or static images, VFR supports arbitrarily long, high-resolution videos with natural user-garment interactions.
Our approach segments long sequences into overlapping clips and enforces local smoothness between adjacent segments while preserving global temporal consistency using a compact anchor video. This design enables stable synthesis over extended motions without relying on long training sequences.
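As a rough illustration of the segmentation step only, the sketch below splits a frame sequence into overlapping clips. The function name `overlapping_clips` and the parameters `clip_len` and `overlap` are hypothetical choices for this example and are not taken from VFR; the smoothness enforcement between segments and the anchor video are not modeled here.

```python
def overlapping_clips(num_frames, clip_len, overlap):
    """Return (start, end) frame ranges (end exclusive) that cover num_frames.

    Adjacent ranges share exactly `overlap` frames. The final range may be
    shorter than clip_len when the sequence does not divide evenly.
    All names and values here are illustrative assumptions, not VFR's settings.
    """
    if not 0 <= overlap < clip_len:
        raise ValueError("overlap must satisfy 0 <= overlap < clip_len")
    if num_frames <= 0:
        return []

    stride = clip_len - overlap  # how far each new clip advances
    clips = []
    start = 0
    while True:
        end = min(start + clip_len, num_frames)
        clips.append((start, end))
        if end == num_frames:  # the sequence is fully covered
            break
        start += stride
    return clips


if __name__ == "__main__":
    # Ten frames, four-frame clips, one shared frame between neighbours.
    print(overlapping_clips(10, 4, 1))
```

The shared frames between neighbouring ranges are where a method of this kind could apply a local smoothness constraint, so that each clip can be generated separately while still joining cleanly with the next.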
The generated videos exhibit appearance consistency from all viewpoints, robust handling of self-occlusions and body interactions, and preservation of fine details such as accessories. As a by-product, VFR also supports free-viewpoint rendering, enabling novel camera angles from a single input image.
Various motions.
Demo Video
Citation
Acknowledgements
We thank you and the other visitors for visiting our project page.