TL;DR: A single-feed-forward method that models unseen 3D geometry using layered point maps.
Summary
- We model scene geometry using layered point maps, where each layer represents the intersection points through which rays pass on a surface.
- We unify object- and scene-level geometric reasoning into a standard 2D regression task, achieving competitive or improved performance in a single feed-forward pass.
- We provide a dataset construction, cleaning, and evaluation benchmark for this task.
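Since the paper's exact training objective is not given here, a minimal sketch of what "a standard 2D regression task" over layered point maps could look like is a masked per-layer L1 loss; the function name and loss choice below are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def masked_l1_loss(pred, target, mask):
    """Hypothetical masked L1 regression loss over layered point maps.

    pred, target: (H, W, L, 3) layered point maps (per-pixel, per-layer 3D points).
    mask:         (H, W, L) boolean validity mask for ray-surface intersections.
    """
    # Zero out layers where the ray has no valid intersection.
    diff = np.abs(pred - target) * mask[..., None]
    # Average over the valid coordinates only (guard against an empty mask).
    return diff.sum() / np.maximum(mask.sum() * 3, 1)

# Toy example: prediction offset from target by a constant 0.1 everywhere.
H, W, L = 4, 4, 3
target = np.random.rand(H, W, L, 3)
pred = target + 0.1
mask = np.ones((H, W, L), dtype=bool)
loss = masked_l1_loss(pred, target, mask)  # ≈ 0.1
```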
Overview
Left: multiple ray-surface intersections + point maps ⇒ Unseen geometry representation.
Right: LaRI map \(\mathbf{V} \in \mathbb{R}^{H\times W \times L \times 3}\) + valid ray intersection mask \(\mathbf{M} \in \{0,1\}^{H\times W \times L}\) ⇒ Final 3D point cloud.
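The conversion from the LaRI map \(\mathbf{V}\) and mask \(\mathbf{M}\) to the final point cloud can be sketched as a simple masked gather; the shapes follow the definitions above, while the concrete values are placeholders for illustration.

```python
import numpy as np

# Hypothetical dimensions: an H x W image with L intersection layers per ray.
H, W, L = 4, 4, 3

# LaRI map V: per-pixel, per-layer 3D ray-surface intersection points.
V = np.random.rand(H, W, L, 3).astype(np.float32)

# Valid-intersection mask M: True where the ray actually hits a surface
# at that layer (later layers capture occluded, unseen geometry).
M = np.zeros((H, W, L), dtype=bool)
M[..., 0] = True       # every ray has a first (visible) intersection
M[:2, :2, 1] = True    # a few rays also have a second, occluded hit

# Final 3D point cloud: keep only valid intersections, flattened to (N, 3).
points = V[M]
assert points.shape == (M.sum(), 3)
```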
Layered Depth Visualization
Object-level 3D Interactive Results
All models are subsampled to 20K points. Unseen geometry is highlighted with a random color.
Scene-level 3D Interactive Results
All models are subsampled to 20K points. Unseen geometry is highlighted with a random color.
3D Reconstruction & Reasoning Evaluation
In the object-level comparison, LaRI yields performance comparable to or better than existing large generative models. In the scene-level comparison, LaRI achieves the best performance in unseen geometry estimation.
Computation Efficiency
LaRI adopts a relatively lightweight design with faster inference speed.
BibTeX
@article{li2025lari,
title={LaRI: Layered Ray Intersections for Single-view 3D Geometric Reasoning},
author={Li, Rui and Zhang, Biao and Li, Zhenyu and Tombari, Federico and Wonka, Peter},
journal={arXiv preprint arXiv:2504.18424},
year={2025}
}