Single-Stage Diffusion NeRF: A Unified Approach to 3D Generation and Reconstruction
ICCV 2023
*Work done during a remote internship with UCSD
Abstract
3D-aware image synthesis encompasses a variety of tasks, such as scene generation and novel view synthesis from images. Despite numerous task-specific methods, developing a comprehensive model remains challenging. In this paper, we present SSDNeRF, a unified approach that employs an expressive diffusion model to learn a generalizable prior of neural radiance fields (NeRF) from multi-view images of diverse objects. Previous studies have used two-stage approaches that rely on pretrained NeRFs as real data to train diffusion models. In contrast, we propose a new single-stage training paradigm with an end-to-end objective that jointly optimizes a NeRF auto-decoder and a latent diffusion model, enabling simultaneous 3D reconstruction and prior learning, even from sparsely available views. At test time, we can directly sample the diffusion prior for unconditional generation, or combine it with arbitrary observations of unseen objects for NeRF reconstruction. SSDNeRF demonstrates robust results comparable to or better than leading task-specific methods in unconditional generation and single/sparse-view 3D reconstruction.
During training, SSDNeRF jointly learns triplane features of individual scenes, a shared NeRF decoder, and a triplane diffusion prior. During testing, it can perform (a) unconditional generation, (b) single-view reconstruction, as well as multi-view reconstruction.
An overview of the SSDNeRF framework with a triplane NeRF representation. During training, we feed a batch of observations in the form of pixel RGB values and rays. The corresponding scene code is randomly initialized and optimized by minimizing both the rendering loss and the diffusion loss, while the model parameters ϕ and ψ are updated along the way.
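To make the joint objective concrete, below is a minimal PyTorch-style sketch of the single-stage training loop described above. The modules TriplaneDecoder and TriplaneDiffusion, their method names, the triplane shape, and the loss weight lambda_diff are hypothetical placeholders standing in for the NeRF auto-decoder (parameters ψ) and the latent diffusion prior (parameters ϕ); this is an illustration of the idea, not the authors' released code.

import torch

# Hypothetical per-scene triplane codes, randomly initialized and learned as parameters.
num_scenes, code_shape = 1000, (3, 32, 128, 128)   # illustrative triplane size
scene_codes = torch.nn.Parameter(torch.randn(num_scenes, *code_shape) * 0.01)

decoder = TriplaneDecoder()       # hypothetical NeRF auto-decoder (parameters psi)
diffusion = TriplaneDiffusion()   # hypothetical latent diffusion prior (parameters phi)
optim = torch.optim.Adam(
    [scene_codes, *decoder.parameters(), *diffusion.parameters()], lr=1e-4)

lambda_diff = 0.01  # illustrative weight balancing rendering and diffusion losses

for rays, target_rgb, scene_idx in data_loader:     # batch of observed pixel RGBs and rays
    code = scene_codes[scene_idx]                    # pick the code of the observed scene
    rendered_rgb = decoder(code, rays)               # volume-render colors along the rays
    loss_render = (rendered_rgb - target_rgb).pow(2).mean()
    loss_diff = diffusion.denoising_loss(code)       # denoising loss on the scene code
    loss = loss_render + lambda_diff * loss_diff     # single end-to-end objective
    optim.zero_grad()
    loss.backward()                                  # gradients reach codes, psi, and phi
    optim.step()

The key point is that the per-scene codes, ψ, and ϕ all receive gradients from both losses in one backward pass, which is what distinguishes this single-stage objective from two-stage pipelines that first fit NeRFs and then train a diffusion model on the frozen codes.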
Unconditional Generation
Trained on ABO Tables
Trained on ShapeNet-SRN Cars
Single-View Reconstruction
Inputs from ShapeNet-SRN Cars test set
Inputs from ShapeNet-SRN Chairs test set
[Sim-to-real] Inputs from KITTI real images (trained on SRN Cars)
Sparse-to-Dense Reconstruction
Novel view synthesis quality (LPIPS) vs. number of input views, evaluated on SRN Cars.
BibTeX
@inproceedings{ssdnerf,
    title={Single-Stage Diffusion NeRF: A Unified Approach to 3D Generation and Reconstruction},
    author={Hansheng Chen and Jiatao Gu and Anpei Chen and Wei Tian and Zhuowen Tu and Lingjie Liu and Hao Su},
    booktitle={ICCV},
    year={2023}
}