This paper proposes a neural rendering approach that represents a scene as "compressed light-field tokens (CLiFTs)", retaining rich appearance and geometric information of a scene.
CLiFT renders efficiently from the compressed tokens, and a single trained network can vary the number of tokens used to represent a scene or to render a novel view.
Concretely, given a set of images, 1) a multi-view encoder tokenizes the images together with their camera poses, 2) latent-space K-means selects a reduced set of rays as cluster centroids using the tokens, and 3) a multi-view "condenser" compresses the information from all the tokens into the centroid tokens to construct CLiFTs.
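The centroid selection in step 2 can be illustrated with a minimal NumPy sketch. This is not the repository's implementation: the function name `select_centroid_tokens`, the fixed iteration count, and the final step of snapping each cluster mean to its nearest real token (so that each centroid corresponds to an actual ray) are assumptions made for illustration only.

```python
import numpy as np

def select_centroid_tokens(tokens, k, iters=10, seed=0):
    """Hypothetical helper: return indices of k tokens used as cluster centroids.

    tokens: (n, d) array of latent tokens, one per ray (assumed layout).
    """
    rng = np.random.default_rng(seed)
    n = tokens.shape[0]
    # Initialize the cluster means from k distinct tokens.
    means = tokens[rng.choice(n, size=k, replace=False)].copy()
    for _ in range(iters):
        # Squared Euclidean distance from every token to every mean: (n, k).
        d2 = ((tokens[:, None, :] - means[None, :, :]) ** 2).sum(-1)
        assign = d2.argmin(axis=1)
        for j in range(k):
            members = tokens[assign == j]
            if len(members) > 0:  # keep the old mean if a cluster is empty
                means[j] = members.mean(axis=0)
    # Assumption: snap each mean to its nearest real token, so that every
    # centroid is an existing ray rather than an averaged vector.
    d2 = ((tokens[:, None, :] - means[None, :, :]) ** 2).sum(-1)
    return d2.argmin(axis=0)

tokens = np.random.default_rng(1).normal(size=(200, 16))
centroid_idx = select_centroid_tokens(tokens, k=8)
```

Here `centroid_idx` holds one token index per cluster. In the pipeline described above, the condenser would then aggregate information from all tokens into the tokens at these indices.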
At test time, given a target view and a compute budget (i.e., the number of CLiFTs), the system collects the specified number of nearby tokens and synthesizes a novel view using a compute-adaptive renderer.
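As a rough illustration of budgeted token collection, the sketch below picks the tokens nearest to a target view. The text above does not specify how "nearby" is measured, so the Euclidean distance between a per-token 3D position and the target camera position is an assumption, as are the function name `collect_nearby_tokens` and its arguments.

```python
import numpy as np

def collect_nearby_tokens(token_positions, target_position, budget):
    """Hypothetical helper: indices of the `budget` tokens nearest the target.

    token_positions: (n, 3) array, an assumed 3D position per CLiFT.
    target_position: (3,) array, the assumed target camera position.
    """
    dists = np.linalg.norm(token_positions - target_position, axis=1)
    # Never request more tokens than exist.
    budget = min(budget, len(dists))
    # A stable sort keeps the original order among equidistant tokens.
    order = np.argsort(dists, kind="stable")
    return order[:budget]

positions = np.array([[0.0, 0.0, 0.0],
                      [1.0, 0.0, 0.0],
                      [5.0, 0.0, 0.0],
                      [2.0, 0.0, 0.0]])
target = np.array([0.0, 0.0, 0.0])
small = collect_nearby_tokens(positions, target, budget=2)
large = collect_nearby_tokens(positions, target, budget=10)
```

A smaller budget passes fewer tokens to the renderer, trading quality for compute; a larger budget does the reverse.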
Please refer to the data preparation guide to download and prepare the RealEstate10K dataset.
Getting started
Please follow the test guide for model inference, evaluation, and visualization.
Please follow the training guide for details about the training pipeline.
Citation
If you find CLiFT useful in your research or applications, please consider citing:
@article{Wang2025CLiFT,
author = {Wang, Zhengqing and Wu, Yuefan and Chen, Jiacheng and Zhang, Fuyang and Furukawa, Yasutaka},
title = {CLiFT: Compressive Light-Field Tokens for Compute Efficient and Adaptive Neural Rendering},
journal = {arXiv preprint arXiv:2507.08776},
year = {2025},
}
Our method is deeply inspired by LVSM and DepthSplat and benefits from their open-source code. Please consider reading these papers if you are interested in related topics.
License
This project is licensed under the GPL; see the license file for details.
About
Code for paper "CLiFT: Compressive Light-Field Tokens for Compute Efficient and Adaptive Neural Rendering" [NeurIPS 2025 (spotlight)]