Method
Our method performs decompositional neural scene reconstruction with a generative diffusion prior. Leveraging this prior through Score Distillation Sampling (SDS), we optimize the geometry and appearance of each object alongside the reconstruction loss, filling in missing information in unobserved and occluded regions. Furthermore, we propose a visibility-guided approach that dynamically adjusts the SDS loss, alleviating the conflict between the reconstruction objective and the generative prior guidance.
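One way to picture the visibility-guided adjustment is to attenuate the standard SDS gradient per pixel by the rendered visibility. The PyTorch sketch below is a minimal illustration, not the released implementation; `eps_model` (a text-conditioned noise-prediction UNet), `alphas_cumprod` (the diffusion noise schedule), and the per-pixel `visibility` map are assumed inputs.

```python
# Minimal sketch of a visibility-weighted SDS gradient (illustrative only).
import torch

def visibility_weighted_sds_grad(render: torch.Tensor,        # (B, C, H, W) rendered view
                                 visibility: torch.Tensor,    # (B, 1, H, W), values in [0, 1]
                                 eps_model,                    # assumed text-conditioned UNet
                                 alphas_cumprod: torch.Tensor  # (T,) cumulative noise schedule
                                 ) -> torch.Tensor:
    """Classic SDS gradient, down-weighted where the scene is well observed."""
    b = render.shape[0]
    # Sample a diffusion timestep and noise the rendering accordingly.
    t = torch.randint(20, 980, (b,), device=render.device)
    a_t = alphas_cumprod[t].view(b, 1, 1, 1)
    noise = torch.randn_like(render)
    x_noisy = a_t.sqrt() * render + (1.0 - a_t).sqrt() * noise
    # SDS gradient: mismatch between predicted and injected noise.
    grad = (1.0 - a_t) * (eps_model(x_noisy, t) - noise)
    # Let the generative prior act mainly on poorly observed pixels, so it
    # fills occluded regions without fighting the reconstruction objective.
    return (1.0 - visibility.clamp(0.0, 1.0)) * grad
```

A common way to apply such a gradient is through a surrogate loss, e.g. adding `(grad.detach() * render).sum()` to the reconstruction loss so backpropagation carries it into the implicit fields.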
Results
Decompositional Scene Reconstruction
Examples from Replica and ScanNet++ demonstrate that our model produces higher-quality reconstructions than the baselines. Our results show more accurate reconstruction in sparsely captured areas, more precise object structures, smoother background surfaces, and fewer floating artifacts.
Novel View Synthesis
Our appearance prior also supplies plausible detail in sparsely captured regions, yielding higher-quality renderings where the baseline results exhibit artifacts.
Scene Editing
Our method enables seamless text-based editing of geometry and appearance through SDS optimization, and it produces decomposed object meshes with detailed UV maps that support photorealistic VFX editing; a hypothetical editing loop is sketched below.
By transforming all objects in the scene into a new, unified style, we can easily generate a new scene with the same layout but a different appearance, which could be beneficial for Metaverse applications.
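As a hypothetical usage sketch, text-based appearance editing can reuse the visibility-weighted SDS gradient above with a new prompt; `make_text_conditioned_unet`, `renderer`, `edit_steps`, and `optimizer` are illustrative placeholders, not the released API.

```python
# Hypothetical editing loop (illustrative names, not the authors' API).
prompt = "a sofa upholstered in red velvet"     # new target appearance
eps_model = make_text_conditioned_unet(prompt)  # assumed helper binding the prompt
for step in range(edit_steps):
    # Render only the object being edited, along with its visibility map.
    render, visibility = renderer.render_object(obj_id=3)
    grad = visibility_weighted_sds_grad(render, visibility, eps_model, alphas_cumprod)
    # Surrogate loss injects the SDS gradient into the object's appearance field.
    loss = (grad.detach() * render).sum()
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```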
Generalization Capability on YouTube Videos
Our model generalizes well to in-the-wild scenes such as YouTube videos, achieving high-quality reconstructions with detailed geometry and appearance from only 15 input views: link to scene1, link to scene2, and link to scene3.
Related Work
MonoSDF: Exploring Monocular Geometric Cues for Neural Implicit Surface Reconstruction
RICO: Regularizing the Unobservable for Indoor Compositional Reconstruction
ObjectSDF++: Improved Object-Compositional Neural Implicit Surfaces
PhyRecon: Physically Plausible Neural Scene Reconstruction
Fantasia3D: Disentangling Geometry and Appearance for High-quality Text-to-3D Content Creation
RichDreamer: A Generalizable Normal-Depth Diffusion Model for Detail Richness in Text-to-3D
BibTeX
@inproceedings{ni2025dprecon,
title={Decompositional Neural Scene Reconstruction with Generative Diffusion Prior},
author={Ni, Junfeng and Liu, Yu and Lu, Ruijie and Zhou, Zirui and Zhu, Song-Chun and Chen, Yixin and Huang, Siyuan},
booktitle={CVPR},
year={2025}
}