In-N-Out: Faithful 3D GAN Inversion with Volumetric Decomposition for Face Editing

Yiran Xu^🐢, Zhixin Shu, Cameron Smith, Seoung Wug Oh, Jia-Bin Huang^🐢

^🐢University of Maryland, College Park, Adobe Research

CVPR 2024

Abstract

3D-aware GANs offer new capabilities for view synthesis while preserving the editing functionalities of their 2D counterparts. GAN inversion is a crucial step that seeks the latent code to reconstruct input images or videos, subsequently enabling diverse editing tasks through manipulation of this latent code. However, a model pre-trained on a particular dataset (e.g., FFHQ) often has difficulty reconstructing images with out-of-distribution (OOD) objects such as faces with heavy make-up or occluding objects. We address this issue by explicitly modeling OOD objects from the input in 3D-aware GANs. Our core idea is to represent the image using two individual neural radiance fields: one for the in-distribution content and the other for the out-of-distribution object. The final reconstruction is achieved by optimizing the composition of these two radiance fields with carefully designed regularization. We demonstrate that our explicit decomposition alleviates the inherent trade-off between reconstruction fidelity and editability. We evaluate reconstruction accuracy and editability of our method on challenging real face images and videos and showcase favorable results against other baselines.
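The composition idea in the abstract can be illustrated with a minimal sketch. It assumes that each sample along a camera ray carries a density and a color from each of the two radiance fields, plus a blending weight in [0, 1] that selects between them, and that the blended samples are then volume-rendered in the standard way. The function name `composite_ray`, the linear blend, and all array shapes are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def composite_ray(sigma_in, rgb_in, sigma_ood, rgb_ood, blend, deltas):
    """Blend two radiance fields per sample, then volume-render one ray.

    sigma_*: (N,) densities, rgb_*: (N, 3) colors, blend: (N,) weights
    in [0, 1] (1 = fully OOD), deltas: (N,) distances between samples.
    The linear blend is an assumption made for illustration.
    """
    b = blend[:, None]
    sigma = (1.0 - blend) * sigma_in + blend * sigma_ood
    rgb = (1.0 - b) * rgb_in + b * rgb_ood

    # Standard volume rendering: per-sample opacity and accumulated transmittance.
    alpha = 1.0 - np.exp(-sigma * deltas)
    transmittance = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))
    weights = transmittance * alpha
    return (weights[:, None] * rgb).sum(axis=0)

# Toy ray: the in-distribution field is red, the OOD field is green.
n = 4
red = np.tile([1.0, 0.0, 0.0], (n, 1))
green = np.tile([0.0, 1.0, 0.0], (n, 1))
dense = np.full(n, 50.0)
step = np.ones(n)

pixel_in = composite_ray(dense, red, dense, green, np.zeros(n), step)
pixel_ood = composite_ray(dense, red, dense, green, np.ones(n), step)
```

With the blending weight at 0 the ray renders only the in-distribution content, and with the weight at 1 it renders only the OOD content. Intermediate weights mix the two before rendering.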

Method Overview

Single image

Reconstruction on Internet Images

Our method can be applied to diverse images from the Internet. We show reconstruction results from single images below.

[Interactive comparison: Input, a selectable baseline (W, W+, IDE-3D, PTI, GOAE, or HFGI3D), and Ours.]

Semantic editing on Internet Images

The reconstructed images can be edited with existing semantic editing methods, e.g., InterFaceGAN and StyleCLIP.
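As a rough illustration of how a latent-direction edit of the InterFaceGAN kind operates, the sketch below moves a latent code along a unit-normalized semantic direction. The 512-dimensional latent, the function name `edit_latent`, and the placeholder direction are assumptions for illustration; a real direction would be learned for an attribute such as "eyeglasses" or "smile".

```python
import numpy as np

def edit_latent(w, direction, strength):
    """Shift latent code w by `strength` along a normalized edit direction."""
    unit = direction / np.linalg.norm(direction)
    return w + strength * unit

# Hypothetical inverted latent code and a placeholder edit direction.
w = np.zeros(512)
direction = np.zeros(512)
direction[0] = 2.0  # stands in for a learned attribute direction

w_edited = edit_latent(w, direction, strength=3.0)
```

A larger `strength` gives a stronger edit, and a negative value moves the attribute the opposite way.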

[Interactive comparison. Methods: W, W+, IDE-3D, PTI, GOAE, HFGI3D. Editing directions: eyeglasses, Elsa, younger, smile, surprised.]

View synthesis on Internet Images

Unlike HFGI3D, which relies on warping and visibility-map computation and therefore requires a time-consuming optimization, our method relies only on a depth map from MiDaS and can still synthesize faithful views.

[Interactive comparison. Methods: W, W+, IDE-3D, PTI, GOAE, HFGI3D.]

Videos

Reconstruction on Internet Videos

Our method can be applied to diverse videos from the Internet. We show reconstruction results below.

[Interactive comparison. Methods: W, W+, IDE-3D, PTI, GOAE, HFGI3D, VIVE3D.]

Semantic editing on Internet Videos

The reconstructed videos can be edited with existing semantic editing methods, e.g., InterFaceGAN and StyleCLIP.

[Interactive comparison. Methods: W, W+, IDE-3D, PTI, GOAE, HFGI3D, VIVE3D. Editing directions: eyeglasses, Elsa, younger, smile, surprised.]

View synthesis on Internet videos

After reconstruction, we can acquire novel views.

[Interactive comparison. Methods: W, W+, IDE-3D, PTI, GOAE, HFGI3D, VIVE3D.]

OOD object removal on Internet videos

By setting the weight of OOD pixels to 0, we can remove the OOD object.
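The removal step can be sketched at the image level. The sketch assumes a per-pixel weight map in [0, 1] that mixes an in-distribution rendering with an OOD rendering. In the method itself the weights presumably act inside the volumetric composition, so this 2D version is a simplification, and the names `blend_images` and `remove_ood` are hypothetical.

```python
import numpy as np

def blend_images(in_dist, ood, weight):
    """Per-pixel blend of two (H, W, 3) images; weight is (H, W) in [0, 1]."""
    w = weight[..., None]
    return (1.0 - w) * in_dist + w * ood

def remove_ood(in_dist, ood, weight):
    """Zero the OOD weight everywhere so only in-distribution content remains."""
    return blend_images(in_dist, ood, np.zeros_like(weight))

# Toy 2x2 example: a uniform face rendering and a uniform OOD rendering.
face = np.full((2, 2, 3), 0.2)
occluder = np.full((2, 2, 3), 0.9)
weight = np.array([[1.0, 0.0], [0.5, 0.0]])

with_ood = blend_images(face, occluder, weight)
without_ood = remove_ood(face, occluder, weight)
```

Pixels whose weight was 1 show the occluder in `with_ood`, while `without_ood` matches the face rendering everywhere.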

Ablation study

[Figure: results without and with blending weight regularization (Eqn. 6), shown for two examples. A second tab covers latent regularization.]

Failure cases

OOD dominates: When an edit falls on an OOD region, e.g., adding eyeglasses over heavy makeup, the blending weights there are close to 1, so eyeglasses added in the in-distribution radiance field barely appear in the final result.

Double glasses: Our OOD radiance field has no knowledge of the GAN or its face prior, so when the OOD object is itself a pair of glasses, adding eyeglasses produces a duplicate object.

BibTeX

@inproceedings{xu2024innout,
  author    = {Xu, Yiran and Shu, Zhixin and Smith, Cameron and Oh, Seoung Wug and Huang, Jia-Bin},
  title     = {In-N-Out: Faithful 3D GAN Inversion with Volumetric Decomposition for Face Editing},
  booktitle = {CVPR},
  year      = {2024},
}
