ScanRefer: 3D Object Localization in RGB-D Scans using Natural Language
European Conference on Computer Vision (ECCV), 2020.
Dave Zhenyu Chen1 Angel X. Chang2 Matthias Nießner1
1Technical University of Munich 2Simon Fraser University
Submit to our ScanRefer Localization Benchmark here!

Introduction
We introduce the task of 3D object localization in RGB-D scans using natural language descriptions. As input, we assume a point cloud of a scanned 3D scene along with a free-form description of a specified target object. To address this task, we propose ScanRefer, learning a fused descriptor from 3D object proposals and encoded sentence embeddings. This fused descriptor correlates language expressions with geometric features, enabling regression of the 3D bounding box of a target object. We also introduce the ScanRefer dataset, containing 51,583 descriptions of 11,046 objects from 800 ScanNet scenes. ScanRefer is the first large-scale effort to perform object localization via natural language expression directly in 3D.
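To make the fusion idea concrete, below is a minimal, illustrative PyTorch sketch of combining per-proposal 3D features with a sentence embedding of the description and scoring each proposal. All module names, feature dimensions, and the GRU-based sentence encoder here are assumptions for illustration only, not the actual ScanRefer implementation (see the Code link below for that).

```python
import torch
import torch.nn as nn

# Illustrative sketch only: names and dimensions are assumptions,
# not the actual ScanRefer architecture.
class FusionSketch(nn.Module):
    def __init__(self, proposal_dim=128, word_dim=300, lang_dim=256, hidden_dim=128):
        super().__init__()
        # Encode the free-form description into a single sentence embedding.
        self.lang_encoder = nn.GRU(word_dim, lang_dim, batch_first=True)
        # Fuse each 3D proposal feature with the sentence embedding and
        # score how well the proposal matches the description.
        self.fusion = nn.Sequential(
            nn.Linear(proposal_dim + lang_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, proposal_feats, word_embeddings):
        # proposal_feats:   (batch, num_proposals, proposal_dim)
        # word_embeddings:  (batch, num_words, word_dim), e.g. pretrained word vectors
        _, lang_feat = self.lang_encoder(word_embeddings)       # (1, batch, lang_dim)
        lang_feat = lang_feat[-1].unsqueeze(1)                   # (batch, 1, lang_dim)
        lang_feat = lang_feat.expand(-1, proposal_feats.size(1), -1)
        fused = torch.cat([proposal_feats, lang_feat], dim=-1)  # per-proposal fusion
        scores = self.fusion(fused).squeeze(-1)                 # (batch, num_proposals)
        return scores  # the highest-scoring proposal's box is the predicted localization

# Toy usage with random tensors.
model = FusionSketch()
scores = model(torch.randn(2, 32, 128), torch.randn(2, 20, 300))
print(scores.shape)  # torch.Size([2, 32])
```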
Browse
The ScanRefer data can be browsed online. Learn more at the ScanRefer Data Browser.
(For a better browsing experience, we recommend using Google Chrome.)

Publication
European Conference on Computer Vision (ECCV), 2020. Paper | arXiv | Code
If you find our project useful, please consider citing us:
@article{chen2020scanrefer,
    title={ScanRefer: 3D Object Localization in RGB-D Scans using Natural Language},
    author={Chen, Dave Zhenyu and Chang, Angel X and Nie{\ss}ner, Matthias},
    journal={16th European Conference on Computer Vision (ECCV)},
    year={2020}
}