MixRT: Mixed Neural Representations
For Real-Time NeRF Rendering
Abstract
Neural Radiance Field (NeRF) has emerged as a leading technique for novel view synthesis, owing to its impressive photorealistic reconstruction and rendering capability. Nevertheless, achieving real-time NeRF rendering in large-scale scenes has presented challenges, often leading to the adoption of either intricate baked mesh representations with a substantial number of triangles or resource-intensive ray marching in baked representations. We challenge these conventions, observing that high-quality geometry, represented by meshes with a substantial number of triangles, is not necessary for achieving photorealistic rendering quality. Consequently, we propose MixRT, a novel NeRF representation that includes a low-quality mesh, a view-dependent displacement map, and a compressed NeRF model. This design effectively harnesses the capabilities of existing graphics hardware, thus enabling real-time NeRF rendering on edge devices. Leveraging a highly-optimized WebGL-based rendering framework, our proposed MixRT attains real-time rendering speeds on edge devices (>30 FPS at a resolution of 1280 x 720 on a MacBook M1 Pro laptop), better rendering quality (0.2 PSNR higher on indoor scenes of the Unbounded-360 datasets), and smaller storage (80%) compared to SotA methods.
Proposed MixRT Representation
An overview of our proposed MixRT rendering pipeline: MixRT integrates three core components:
a low-quality mesh, a view-dependent displacement map, and a NeRF model compressed into a hash table.
This combination aims to maximize utilization of diverse hardware resources.
To render an image pixel: (1) We use rasterizer hardware to perform mesh rasterization, determining the ray-mesh intersection point, $p$.
(2) Leveraging texture mapping units, we use texture coordinates to access maps containing the spherical harmonics (SH) coefficients
and scale, computing the calibrated point, $p_{cali}$.
(3) Lastly, $p_{cali}$ is processed by SIMD units, which retrieve embeddings for its eight closest vertices from the 3D grid stored as a hash table and trilinearly interpolate them.
A small MLP network then converts the interpolated embedding into the final rendered color.
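Steps (2) and (3) above can be sketched numerically as follows. This is a minimal illustration, assuming an Instant-NGP-style spatial hash for the feature grid and a degree-1 SH basis for the view-dependent displacement; all function names, table sizes, and constants are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

# Primes commonly used for Instant-NGP-style spatial hashing (assumption).
PRIMES = np.array([1, 2654435761, 805459861], dtype=np.uint64)

def hash_coords(ix, table_size):
    """Hash integer 3D grid coordinates into a table of the given size."""
    return int(np.bitwise_xor.reduce(ix.astype(np.uint64) * PRIMES)
               % np.uint64(table_size))

def calibrate_point(p, view_dir, sh_coeffs, scale):
    """Step (2): displace the ray-mesh hit p by a view-dependent offset.

    sh_coeffs: (4, 3) degree-1 SH coefficients fetched from the texture maps
    (illustrative layout); scale is the per-texel scale factor.
    """
    x, y, z = view_dir
    basis = np.array([0.2820948, 0.4886025 * y, 0.4886025 * z, 0.4886025 * x])
    displacement = basis @ sh_coeffs          # (3,) view-dependent offset
    return p + scale * displacement           # p_cali

def grid_features(p_cali, table, resolution):
    """Step (3): trilinearly interpolate the embeddings of the eight grid
    vertices closest to p_cali, each fetched from the hash table."""
    g = p_cali * resolution
    g0 = np.floor(g).astype(np.int64)
    w = g - g0                                # trilinear weights in [0, 1)
    feat = np.zeros(table.shape[1])
    for dx in (0, 1):
        for dy in (0, 1):
            for dz in (0, 1):
                corner = g0 + np.array([dx, dy, dz])
                wc = ((w[0] if dx else 1 - w[0]) *
                      (w[1] if dy else 1 - w[1]) *
                      (w[2] if dz else 1 - w[2]))
                feat += wc * table[hash_coords(corner, len(table))]
    return feat
```

In the full pipeline, the interpolated feature would then be fed to a small MLP that outputs the final RGB color; here a plain matrix multiply would stand in for that network.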
Real-Time Interactive Viewer Demos
Collision Animation
stump
kitchenlego
officebonsai
kitchencounter
bicycle
gardenvase
fulllivingroom
Acknowledgements
The website template was borrowed from Instant Neural Graphics Primitives.