CLiFT: Compressive Light-Field Tokens for Compute-Efficient and Adaptive Neural Rendering

Zhengqing Wang¹, Yuefan Wu¹, Jiacheng Chen¹, Fuyang Zhang¹, Yasutaka Furukawa¹,²

¹ Simon Fraser University ² Wayve

(arXiv, Project page)

Checklists

Release the training and inference on RE10K
Add more detailed docs
Release the training and inference on DL3DV

Introduction

This paper proposes a neural rendering approach that represents a scene as "compressed light-field tokens (CLiFTs)", which retain the scene's rich appearance and geometric information.

CLiFT enables compute-efficient rendering with compressed tokens, and a single trained network can vary the number of tokens used to represent a scene or to render a novel view.
Concretely, given a set of images, 1) a multi-view encoder tokenizes the images together with their camera poses, 2) latent-space K-means selects a reduced set of rays as cluster centroids using the tokens, and 3) a multi-view "condenser" compresses the information of all the tokens into the centroid tokens to construct CLiFTs.
At test time, given a target view and a compute budget (i.e., the number of CLiFTs), the system collects the specified number of nearby tokens and synthesizes a novel view using a compute-adaptive renderer.
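
To make the token-selection steps more concrete, here is a minimal sketch of step 2 (picking centroid tokens with K-means) and of the test-time collection of tokens under a budget. It is not the code in this repository: the function names `kmeans_select` and `collect_for_view` are hypothetical, tokens are plain tuples rather than network features, and measuring "nearby" by Euclidean distance between camera positions is an assumption made only for illustration. The learned encoder, condenser, and renderer are omitted.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_select(tokens, k, iters=10, seed=0):
    """Return indices of k tokens chosen as cluster representatives.

    Runs plain Lloyd's K-means in token space, then snaps every final
    centroid to the closest real token, so the result is a subset of the
    input tokens (and hence of the input rays).
    """
    rng = random.Random(seed)
    centroids = [list(tokens[i]) for i in rng.sample(range(len(tokens)), k)]
    for _ in range(iters):
        # Assignment step: group token indices by their nearest centroid.
        clusters = [[] for _ in range(k)]
        for idx, tok in enumerate(tokens):
            nearest = min(range(k), key=lambda c: sq_dist(tok, centroids[c]))
            clusters[nearest].append(idx)
        # Update step: move each centroid to the mean of its members.
        for c, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                dim = len(centroids[c])
                centroids[c] = [
                    sum(tokens[i][d] for i in members) / len(members)
                    for d in range(dim)
                ]
    # Snap each centroid to the closest token that is not already chosen.
    chosen = []
    for centroid in centroids:
        order = sorted(range(len(tokens)),
                       key=lambda i: sq_dist(tokens[i], centroid))
        chosen.append(next(i for i in order if i not in chosen))
    return chosen


def collect_for_view(token_positions, target_position, budget):
    """Return indices of the `budget` tokens closest to the target camera.

    Assumption: each token carries the position of the camera it came from,
    and closeness is plain Euclidean distance to the target camera position.
    """
    order = sorted(range(len(token_positions)),
                   key=lambda i: sq_dist(token_positions[i], target_position))
    return order[:budget]


# Toy usage: six 2-D "tokens" forming two well-separated groups.
toy_tokens = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
              (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
centroid_ids = kmeans_select(toy_tokens, k=2)

# Toy usage: keep the 2 tokens whose source cameras are nearest the target.
toy_positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0),
                 (5.0, 0.0, 0.0), (0.5, 0.0, 0.0)]
nearby_ids = collect_for_view(toy_positions, (0.0, 0.0, 0.0), budget=2)
```

Changing `budget` in the sketch mirrors the trade-off described above: a larger budget passes more tokens to the renderer, and a smaller budget reduces compute.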

Installation

Please refer to the installation guide to set up the environment.

Data preparation

Please refer to the data preparation guide to download and prepare the RealEstate10K dataset.

Getting started

Please follow the test guide for model inference, evaluation, and visualization.

Please follow the training guide for details about the training pipeline.

Citation

If you find CLiFT useful in your research or applications, please consider citing:

@article{Wang2025CLiFT,
  author    = {Wang, Zhengqing and Wu, Yuefan and Chen, Jiacheng and Zhang, Fuyang and Furukawa, Yasutaka},
  title     = {CLiFT: Compressive Light-Field Tokens for Compute-Efficient and Adaptive Neural Rendering},
  journal   = {arXiv preprint arXiv:2507.08776},
  year      = {2025},
}

Our method is deeply inspired by LVSM and DepthSplat, and it benefits from their open-source code. Please consider reading these papers if you are interested in related topics.

License

This project is licensed under the GPL; see the license file for details.

