Method

MultiDiff leverages strong depth and video diffusion priors, together with a novel correspondence attention layer, to enable consistent novel view synthesis of scenes from a single RGB image.

Novel-view rendering results following the ground-truth (GT) trajectory.

By warping the initial noise into the target novel views according to the estimated depth, we structure the noise so that it carries additional information about the 3D scene structure. Just like Neo in "The Matrix", the model can decode this abstract noise pattern into more consistent views.
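
To make the depth-based noise warping concrete, here is a minimal NumPy sketch. It is not the authors' implementation. It assumes a pinhole camera with intrinsics `K` shared by both views, a relative pose `(R, t)` mapping source-camera to target-camera coordinates, nearest-pixel forward splatting, and fresh Gaussian noise for target pixels that receive no source sample. The function name `warp_noise` and all parameter names are hypothetical.

```python
import numpy as np


def warp_noise(noise, depth, K, R, t, rng):
    """Forward-warp per-pixel noise from a source view into a target view.

    noise: (H, W, C) Gaussian noise attached to the source pixels
    depth: (H, W) estimated depth of the source view
    K:     (3, 3) pinhole intrinsics, assumed identical for both views
    R, t:  (3, 3) rotation and (3,) translation, source -> target camera
    Returns the warped noise and a boolean mask of pixels that were filled.
    """
    H, W, _ = noise.shape
    # Homogeneous pixel coordinates (u, v, 1) in row-major order.
    v, u = np.mgrid[0:H, 0:W]
    pix = np.stack([u.ravel(), v.ravel(), np.ones(H * W)], axis=1).astype(np.float64)
    # Unproject to 3D using the depth, then move into the target camera frame.
    pts_src = (pix @ np.linalg.inv(K).T) * depth.reshape(-1, 1)
    pts_tgt = pts_src @ R.T + t
    z = pts_tgt[:, 2]
    in_front = z > 1e-6
    z_safe = np.where(in_front, z, 1.0)  # avoid dividing by zero
    # Project into the target image and round to the nearest pixel.
    proj = pts_tgt @ K.T
    u_t = np.rint(proj[:, 0] / z_safe).astype(int)
    v_t = np.rint(proj[:, 1] / z_safe).astype(int)
    valid = in_front & (u_t >= 0) & (u_t < W) & (v_t >= 0) & (v_t < H)
    # Write far points first so that nearer points overwrite them
    # (relies on later assignments winning for repeated indices).
    idx = np.flatnonzero(valid)
    idx = idx[np.argsort(-z[idx], kind="stable")]
    # Holes keep fresh noise (an assumption of this sketch).
    warped = rng.standard_normal(noise.shape)
    filled = np.zeros((H, W), dtype=bool)
    warped[v_t[idx], u_t[idx]] = noise.reshape(H * W, -1)[idx]
    filled[v_t[idx], u_t[idx]] = True
    return warped, filled


# Toy example: a fronto-parallel plane at depth 2 and a sideways camera shift
# chosen so that every pixel moves exactly one column to the right.
rng = np.random.default_rng(0)
H, W = 4, 6
K = np.array([[4.0, 0.0, 2.5], [0.0, 4.0, 1.5], [0.0, 0.0, 1.0]])
noise = rng.standard_normal((H, W, 3))
depth = np.full((H, W), 2.0)
warped, filled = warp_noise(noise, depth, K, np.eye(3), np.array([0.5, 0.0, 0.0]), rng)
```

In the toy example the shift in pixels is `f * t_x / depth = 4 * 0.5 / 2 = 1`, so the warped noise is the source noise moved by one column, and the first column is a hole filled with new samples. Nearest-pixel splatting is the simplest choice; it keeps each copied sample Gaussian, whereas interpolating between samples would change the noise statistics.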

By masking areas in the input image, MultiDiff naturally enables consistent editing without the need for finetuning.
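
As a small illustration of the input side of this editing setup, the sketch below blanks out a rectangular region of an image and returns the matching binary mask. How MultiDiff actually encodes masked pixels is not described here, so the zero fill, the rectangular region, and the helper name `mask_region` are all assumptions made for illustration.

```python
import numpy as np


def mask_region(image, top, left, height, width, fill=0.0):
    """Blank out a rectangle; return the masked image and a keep-mask.

    image: (H, W, C) array
    The keep-mask is True where the original pixels are preserved.
    """
    keep = np.ones(image.shape[:2], dtype=bool)
    keep[top:top + height, left:left + width] = False
    # Broadcast the 2D mask over the channel axis.
    masked = np.where(keep[..., None], image, fill)
    return masked, keep


# Toy example: hide a 2x2 patch of a 4x4 RGB image.
image = np.ones((4, 4, 3))
masked, keep = mask_region(image, top=1, left=1, height=2, width=2)
```

The masked image and the mask would then be passed to the model as conditioning, leaving the hidden region free to be synthesized consistently across views.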

BibTeX

@InProceedings{Muller_2024_CVPR,
    author    = {M\"uller, Norman and Schwarz, Katja and R\"ossle, Barbara and Porzi, Lorenzo and Bul\`o, Samuel Rota and Nie{\ss}ner, Matthias and Kontschieder, Peter},
    title     = {MultiDiff: Consistent Novel View Synthesis from a Single Image},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2024},
    pages     = {10258-10268}
}