The 8-Point Algorithm as an Inductive Bias for Relative Pose Prediction by ViTs

Chris Rockwell

Justin Johnson

David F. Fouhey

University of Michigan

3DV 2022

[GitHub] [arXiv] [BibTex]

Figure 1. We propose three small modifications to a ViT via the Essential Matrix Module, enabling computations similar to the Eight-Point algorithm. The resulting mix of visual and positional features is a good inductive bias for pose estimation.

Abstract

We present a simple baseline for directly estimating the relative pose (rotation and translation, including scale) between two images. Deep methods have recently shown strong progress but often require complex or multi-stage architectures. We show that a handful of modifications can be applied to a Vision Transformer (ViT) to bring its computations close to the Eight-Point Algorithm. This inductive bias enables a simple method to be competitive in multiple settings, often substantially improving over the state of the art with strong performance gains in limited data regimes.
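For readers unfamiliar with the Eight-Point Algorithm mentioned above, the following is a minimal NumPy sketch of the classical linear method, not code from the paper. It assumes normalized image coordinates (camera intrinsics already removed) and the convention x2ᵀ E x1 = 0. The function names `eight_point_essential` and `synthetic_pair` are hypothetical and chosen only for illustration. Note that this classical procedure recovers translation only up to scale.

```python
import numpy as np

def eight_point_essential(x1, x2):
    """Estimate an essential matrix E with x2h^T E x1h = 0.

    x1, x2: (N, 2) arrays of matched, normalized image coordinates, N >= 8.
    """
    x1h = np.hstack([x1, np.ones((len(x1), 1))])  # homogeneous points, image 1
    x2h = np.hstack([x2, np.ones((len(x2), 1))])  # homogeneous points, image 2
    # Each row is the outer product x2h x1h^T flattened row-major,
    # so that A @ E.ravel() = 0 encodes the epipolar constraint.
    A = np.einsum("ni,nj->nij", x2h, x1h).reshape(-1, 9)
    # The least-squares solution is the right singular vector of A
    # with the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    E = vt[-1].reshape(3, 3)
    # Project onto the set of essential matrices: singular values (1, 1, 0).
    u, _, vt = np.linalg.svd(E)
    return u @ np.diag([1.0, 1.0, 0.0]) @ vt

def synthetic_pair(seed=0, n=20):
    """Hypothetical helper: noise-free correspondences for a known pose."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(-1.0, 1.0, (n, 3)) + np.array([0.0, 0.0, 5.0])
    a = 0.1  # rotation about the y axis, in radians
    R = np.array([[np.cos(a), 0.0, np.sin(a)],
                  [0.0, 1.0, 0.0],
                  [-np.sin(a), 0.0, np.cos(a)]])
    t = np.array([0.5, 0.1, 0.0])
    X2 = X @ R.T + t  # the same points in the second camera frame
    tx = np.array([[0.0, -t[2], t[1]],
                   [t[2], 0.0, -t[0]],
                   [-t[1], t[0], 0.0]])
    return X[:, :2] / X[:, 2:], X2[:, :2] / X2[:, 2:], tx @ R

x1, x2, E_true = synthetic_pair()
E = eight_point_essential(x1, x2)  # matches E_true up to scale and sign
```

The key structural point is that every row of `A` is built from products of positions in the two images; this is the kind of positional computation the abstract refers to when it speaks of bringing a ViT's computations close to the Eight-Point Algorithm.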

Approach

Figure 2. Essential Matrix Module. We make three small changes to standard ViT Cross-Attention: (1) appending positional encodings to Values, (2) applying a dual softmax on Affinities, and (3) applying bilinear attention.
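The three changes listed in the caption can be sketched in a few lines of NumPy. This is an illustrative, single-head reading of the caption, not the authors' implementation (see the linked code for that). Several details are assumptions: the dual softmax is taken to be the elementwise product of a row-wise and a column-wise softmax, bilinear attention is taken to be V1ᵀ A V2, one shared value projection `Wv` is used for both images, and all function and argument names are hypothetical.

```python
import numpy as np

def softmax(z, axis):
    z = z - z.max(axis=axis, keepdims=True)  # subtract max for stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def dual_softmax(logits):
    # Assumed form: product of softmax over rows and softmax over columns,
    # so an entry is large only if both tokens prefer each other.
    return softmax(logits, axis=1) * softmax(logits, axis=0)

def essential_matrix_module_sketch(f1, f2, pos1, pos2, Wq, Wk, Wv):
    """f1: (N, D) tokens of image 1, f2: (M, D) tokens of image 2,
    pos1: (N, P) and pos2: (M, P) positional encodings,
    Wq, Wk, Wv: (D, D) projections. Returns a (D + P, D + P) feature matrix."""
    q = f1 @ Wq
    k = f2 @ Wk
    # (1) Append positional encodings to the values.
    v1 = np.concatenate([f1 @ Wv, pos1], axis=1)
    v2 = np.concatenate([f2 @ Wv, pos2], axis=1)
    # (2) Dual softmax on the affinities.
    A = dual_softmax(q @ k.T / np.sqrt(q.shape[1]))
    # (3) Bilinear attention (assumed form): sum_ij A[i, j] * outer(v1[i], v2[j]).
    return v1.T @ A @ v2
```

Under these assumptions, the bottom-right P x P block of the output is an affinity-weighted sum of outer products of positions from the two images, which resembles the position products that the Eight-Point Algorithm accumulates over correspondences. The remaining blocks mix visual and positional features.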

Paper and Supplemental Material

Rockwell, Johnson and Fouhey.
The 8-Point Algorithm as an Inductive Bias for Relative Pose Prediction by ViTs.
In 3DV 2022. (Hosted on arXiv)

[Paper+Supplemental] [Code]

            @inProceedings{Rockwell2022,
              author = {Chris Rockwell and Justin Johnson and David F. Fouhey},
              title = {The 8-Point Algorithm as an Inductive Bias for Relative Pose Prediction by ViTs},
              booktitle = {3DV},
              year = 2022
            }

Acknowledgements

Thanks to Linyi Jin, Ruojin Cai and Zach Teed for help replicating and building upon their works. Thanks to Mohamed El Banani, Karan Desai and Nilesh Kulkarni for their many helpful suggestions. Thanks to Laura Fink and UM DCO for their tireless support with computing! The webpage template originally came from some colorful folks.
