[WACV'25] CorrFill: Enhancing Faithfulness in Reference-based Inpainting with Correspondence Guidance in Diffusion Models

The official PyTorch implementation of CorrFill: Enhancing Faithfulness in Reference-based Inpainting with Correspondence Guidance in Diffusion Models (WACV'25).

Kuan-Hung Liu¹, Cheng-Kun Yang*², Min-Hung Chen³, Yu-Lun Liu¹, Yen-Yu Lin¹
¹National Yang Ming Chiao Tung University, ²National Taiwan University, ³NVIDIA
(*Now at MediaTek Inc., Taiwan.)

[Paper] [Website] [BibTeX] [WACV'25 Poster]

This work introduces CorrFill, a training-free module designed to enhance the faithfulness of reference-based image inpainting in diffusion models. CorrFill estimates the correspondence between the reference and target images during inpainting and uses it to guide the diffusion model through self-attention masking and input latent optimization. Experiments on RealEstate10K and MegaDepth with four different baseline diffusion models demonstrate higher faithfulness in both quantitative and qualitative results.

For business inquiries, please visit our website and submit the form: NVIDIA Research Licensing.

Installation

Prerequisite

pytorch 2.0.1 & torchvision

Environment Setup

This code base requires Python 3.10.

git clone git@github.com:khliu0000/corr_inpainting.git
cd corr_inpainting
conda env create --file environment.yml python=3.10

Experiment Setup

Models

Most of the required models will be automatically downloaded through huggingface when first executed.

Reproducing IP-Adapter-Plus with CorrFill requires runwayml's Stable Diffusion, which is currently unavailable. However, you can use other inpainting models that are compatible with IP-Adapter-Plus.

The image encoder of IP-Adapter-Plus and the weights for the adapter are required to reproduce results for IP-Adapter-Plus with CorrFill.
Please download models/image_encoder/* and models/ip-adapter-plus_sd15.bin from the huggingface repository, and place them in the following structure:

corr_inpainting    # repo root
├── IP-Adapter
│   └── models
│       ├── image_encoder
│       └── ip-adapter-plus_sd15.bin
...
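
Before launching the IP-Adapter-Plus experiments, it can help to confirm that the files sit where the tree above expects them. The following is a minimal sketch, not part of the repository: the helper name missing_ip_adapter_files is hypothetical, and it only checks the two paths shown in the tree.

```python
from pathlib import Path


def missing_ip_adapter_files(repo_root: str = ".") -> list[str]:
    """Return the expected IP-Adapter-Plus paths that are absent or empty."""
    models = Path(repo_root) / "IP-Adapter" / "models"
    missing = []

    # The image encoder is a directory that should contain the downloaded files.
    encoder = models / "image_encoder"
    if not encoder.is_dir() or not any(encoder.iterdir()):
        missing.append(str(encoder))

    # The adapter weights are a single file.
    weights = models / "ip-adapter-plus_sd15.bin"
    if not weights.is_file():
        missing.append(str(weights))

    return missing
```

Calling missing_ip_adapter_files() from the repo root returns an empty list when both items are in place, and otherwise lists the paths that still need to be downloaded.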

Dataset

Due to copyright restrictions, we cannot provide our testing data. However, we offer examples of how we generate image pairs and random masks. Please refer to ./data_process/.

How to reproduce experiments

To reproduce the experiments, we recommend running with at least 24GB of memory.
Disabling latent input optimization reduces the amount of memory required, but it also lowers performance.
- To disable latent optimization, set latent_optimize to false in the config files.
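
If you prefer not to edit the shipped config files by hand, a small script can write a low-memory copy. This is only a sketch: it assumes latent_optimize is a top-level key in the JSON config, and the helper name and the output filename in the usage note are hypothetical.

```python
import json
from pathlib import Path


def write_low_memory_config(src: str, dst: str) -> dict:
    """Copy a config file to dst with latent optimization switched off."""
    config = json.loads(Path(src).read_text())
    # Assumption: latent_optimize is a top-level boolean in the config.
    config["latent_optimize"] = False
    Path(dst).write_text(json.dumps(config, indent=2) + "\n")
    return config
```

For example, write_low_memory_config("config/leftrefill_realestate.json", "config/leftrefill_realestate_lowmem.json") produces a copy that can then be passed to corr_guided.py through --config.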
The model is in float16 mode by default.
Results may vary slightly with each execution. Please refer to PyTorch's documentation on reproducibility.
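
To narrow the run-to-run variation, the usual PyTorch recipe is to fix the random seeds and ask cuDNN for deterministic kernels. The sketch below shows that generic recipe; the function name seed_everything is hypothetical, and whether corr_guided.py exposes a place to call it is not stated here, so treat the call site as an assumption. Some GPU operations remain non-deterministic even with these settings.

```python
import random

import torch


def seed_everything(seed: int) -> None:
    """Seed Python and PyTorch RNGs and prefer deterministic cuDNN kernels."""
    random.seed(seed)
    torch.manual_seed(seed)
    # Safe to call without a GPU; the seed is applied once CUDA initializes.
    torch.cuda.manual_seed_all(seed)
    # Trade some speed for repeatable convolution algorithm selection.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
```

Calling seed_everything with the same value before two runs makes the sampled noise identical on the same hardware and software stack.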

The results will be stored in ./results/. Use the commands below to reproduce each experiment.

LeftRefill

RealEstate10K

python corr_guided.py --config config/leftrefill_realestate.json

MegaDepth

python corr_guided.py --config config/leftrefill_megadepth.json

Side-by-side

RealEstate10K

python corr_guided.py --config config/sidebyside_realestate.json

MegaDepth

python corr_guided.py --config config/sidebyside_megadepth.json

IP-Adapter-Plus

RealEstate10K

python corr_guided.py --config config/ipadapter_realestate.json

MegaDepth

python corr_guided.py --config config/ipadapter_megadepth.json

The directory ipa_embed will be created to cache the image embeddings, which saves memory.

Paint-by-Example

RealEstate10K

python corr_guided.py --config config/paintbyexample_realestate.json

MegaDepth

python corr_guided.py --config config/paintbyexample_megadepth.json

The directory pbe_embed will be created to cache the image embeddings, which saves memory.
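
The eight commands above differ only in the config filename, so they can be run in sequence from a short driver script. This is a convenience sketch rather than something shipped with the repository: the names BASELINES, DATASETS, build_commands, and run_all are hypothetical, while the config paths are the ones listed above.

```python
import subprocess
import sys

# Filename prefixes and suffixes taken from the config paths listed above.
BASELINES = ["leftrefill", "sidebyside", "ipadapter", "paintbyexample"]
DATASETS = ["realestate", "megadepth"]


def build_commands(python: str = sys.executable) -> list[list[str]]:
    """Build one corr_guided.py command per baseline/dataset pair."""
    return [
        [python, "corr_guided.py", "--config", f"config/{baseline}_{dataset}.json"]
        for baseline in BASELINES
        for dataset in DATASETS
    ]


def run_all(dry_run: bool = True) -> None:
    """Print each command; execute it too when dry_run is False."""
    for cmd in build_commands():
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the batch at the first failing experiment.
            subprocess.run(cmd, check=True)
```

Run run_all() from the repo root to preview the commands, and run_all(dry_run=False) to execute them one after another.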

Code Structure

corr_guided.py: Script for reproducing the experiments.
corrfill_module: Implementation of our main method.
- corrfill_module/corrfill.py: Code for CorrFill inpainting method, including attention processor swapping and attention maps collections.
- corrfill_module/cg_pipeline.py: Code for customized diffusers pipeline of CorrFill.
- corrfill_module/pbe_pipeline.py: Code for customized diffusers pipeline of CorrFill, for the experiments of PaintByExample.
- corrfill_module/pbe_image_encoder.py: Code for image encoder used by PaintByExample from diffusers.
- corrfill_module/utils.py: Code for loss calculation for both pipelines.
- corrfill_module/share.py: Shared objects used globally.
data_process: Example scripts for dataset processing, provided for reference.
ip_adapter: IP-Adapter modules, taken from IP-Adapter, for the IP-Adapter-Plus experiments.

Acknowledgement

Code

The implementation is based on huggingface diffusers.
The implementation of latent optimization is inspired by Diffusion Self-Guidance for Controllable Image Generation and HD-Painter.
The IP-Adapter Plus module is adapted from tencent-ailab.
The PaintByExample module is implemented by diffusers.
The implementation of LeftRefill is adapted from ewrfcas.
The example script for RealEstate10K collection is adapted from RealEstate10K downloader.
The example script for mask generation is adapted from Tangshitao's implementation of QuadTreeAttention for LoFTR.

Weights and Biases

The learned prompt embeds of LeftRefill are extracted from the released weights of LeftRefill.

Dataset

The RealEstate10K dataset consists of frames extracted from YouTube videos and annotated by Google.
The MegaDepth dataset is collected from the MegaDepth project.

Citation

If you find CorrFill useful, please consider starring the repository and citing our paper:

@inproceedings{liu2025corrfill,
  title={CorrFill: Enhancing Faithfulness in Reference-based Inpainting with Correspondence Guidance in Diffusion Models},
  author={Liu, Kuan-Hung and Yang, Cheng-Kun and Chen, Min-Hung and Liu, Yu-Lun and Lin, Yen-Yu},
  booktitle={WACV},
  year={2025}
}

Licenses

This work is made available under the NVIDIA Source Code License-NC. Click here to view a copy of this license.
