3DSR
Bridging Diffusion Models and 3D Representations:
A 3D Consistent Super-Resolution Framework
1 University of Maryland, College Park
2 Carnegie Mellon University
3 University of Illinois at Urbana-Champaign
ICCV 2025
Abstract
We propose 3D Super Resolution (3DSR), a novel 3D Gaussian-splatting-based super-resolution framework that leverages off-the-shelf diffusion-based 2D super-resolution models.
3DSR encourages 3D consistency across views via the use of an explicit 3D Gaussian-splatting-based scene representation.
This distinguishes 3DSR from prior work such as per-image upsampling or video super-resolution, which either ignores 3D consistency or tries to incorporate it only implicitly.
Notably, our method enhances visual quality without additional fine-tuning, ensuring spatial coherence within the reconstructed scene.
We evaluate 3DSR on MipNeRF360 and LLFF data, demonstrating that it produces high-resolution results that are visually compelling, while maintaining structural consistency in 3D reconstructions.
Method Overview
Given a latent x at timestep t, we first estimate the clean image x0 using the denoiser of a pre-trained diffusion model. To enforce 3D consistency, we leverage 3D Gaussian Splatting (3DGS) by treating the predicted clean images as training data. The 3DGS model renders high-resolution, view-consistent images, which are then re-encoded into the latent space to estimate the latent of the next denoising step.
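The loop described above can be sketched in a few lines. This is a minimal illustration of the control flow only, not the authors' implementation: the DDIM-style update (eta = 0), the reuse of the predicted noise after re-encoding, and the names `consistent_sr_step`, `denoiser`, `decode`, `encode`, and `fit_and_render` are all assumptions made for the example. In the toy run, the diffusion denoiser is replaced by a zero-noise predictor, the latent encoder and decoder by identity functions, and the 3DGS fit-and-render step by a simple average over views.

```python
import numpy as np

def predict_x0(x_t, eps_hat, alpha_bar_t):
    """Clean-sample estimate from a noise prediction (standard DDPM/DDIM identity)."""
    return (x_t - np.sqrt(1.0 - alpha_bar_t) * eps_hat) / np.sqrt(alpha_bar_t)

def ddim_step(x0, eps_hat, alpha_bar_prev):
    """Deterministic DDIM update (eta = 0): move x0 to the previous noise level."""
    return np.sqrt(alpha_bar_prev) * x0 + np.sqrt(1.0 - alpha_bar_prev) * eps_hat

def consistent_sr_step(latents, t, alpha_bar, denoiser, decode, encode, fit_and_render):
    """One denoising step over all views with a 3D-consistency projection.

    All callables are hypothetical stand-ins:
      denoiser(x, t)      -> predicted noise for one view's latent
      decode(z), encode(i) -> latent <-> image mappings
      fit_and_render(imgs) -> fit a 3D scene to the images and re-render every view
    """
    a_t = alpha_bar[t]
    a_prev = alpha_bar[t - 1] if t > 0 else 1.0

    # 1. Estimate the clean latent of every view from the current noisy latent.
    eps_hat = np.stack([denoiser(x, t) for x in latents])
    x0_latents = predict_x0(latents, eps_hat, a_t)

    # 2. Decode to images and use them as training data for the 3D representation.
    images = np.stack([decode(z) for z in x0_latents])
    rendered = fit_and_render(images)

    # 3. Re-encode the view-consistent renders and step to the next noise level.
    x0_consistent = np.stack([encode(img) for img in rendered])
    return ddim_step(x0_consistent, eps_hat, a_prev)

def run_toy(num_views=3, size=4, num_steps=10, seed=0):
    """Run the loop with toy stand-ins to show the data flow end to end."""
    rng = np.random.default_rng(seed)
    alpha_bar = np.linspace(0.999, 0.01, num_steps)  # decreases as t grows
    latents = rng.standard_normal((num_views, size, size))

    identity = lambda a: a                              # toy encoder/decoder
    zero_eps = lambda x, t: np.zeros_like(x)            # toy denoiser
    mean_views = lambda imgs: np.broadcast_to(          # toy "3DGS": average views
        imgs.mean(axis=0), imgs.shape)

    for t in reversed(range(num_steps)):
        latents = consistent_sr_step(
            latents, t, alpha_bar, zero_eps, identity, identity, mean_views)
    return latents

if __name__ == "__main__":
    out = run_toy()
    print(out.shape)
```

The point of the sketch is the ordering of operations: the consistency step sits between the clean-image estimate and the update to the next latent, so every denoising step is pulled toward a single shared 3D scene. A real pipeline would swap in a pre-trained diffusion super-resolution model, its latent autoencoder, and an actual 3DGS optimizer for the stand-ins.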
Interactive video demo
Applications
Scene
- trex
- flower
- leaves
- horns
- room
- bicycle
- bonsai
- garden
- kitchen
- stump
- treehill
BibTeX
@inproceedings{chen2025bridging,
author = {Chen, Yi-Ting and Liao, Ting-Hsuan and Guo, Pengsheng and Schwing, Alexander and Huang, Jia-Bin},
title = {Bridging Diffusion Models and 3D Representations: A 3D Consistent Super-Resolution Framework},
booktitle = {Proceedings of the International Conference on Computer Vision (ICCV)},
year = {2025}
}