Zitian Zhang, Frédéric Fortier-Chouinard, Mathieu Garon, Anand Bhattad, Jean-François Lalonde
WACV 2025 (Oral)
[Website] [Paper] [Supplementary]
- Clone the repo and its submodules:
git clone --recursive https://github.com/zzt76/zerocomp.git
Make sure you also clone the predictors submodule.
- Install the required packages. Note that different intrinsic predictors may require different libraries:
pip install -r requirements.txt
- Download Stable Diffusion 2.1 from Hugging Face, and set the pretrained_model_name_or_path variable in configs/sg_labo.yaml.
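For example, the relevant line might look like the following (this assumes the standard Hugging Face model id; other keys in the config are omitted):
pretrained_model_name_or_path: stabilityai/stable-diffusion-2-1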
Pretrained weights are only available for non-commercial use under the CC BY-NC-SA 4.0 license.
- OpenRooms 7 days (depth, normals, albedo): link
- OpenRooms 2 days (normals, albedo): link
- InteriorVerse 2 days (depth, normals, albedo, roughness, metallic): link
- InteriorVerse 7 days (depth, normals, albedo, roughness, metallic): link
In our paper, the footprint depth of the object is needed to align the object depth with the background depth (or the other way around if the depth is relative disparity). However, we noticed that the footprint depth is not always available, so we provide two different solutions:
- In the newest version, when the footprint depth is not available, we use the smallest background depth value inside the object mask as the minimum object depth (see the sketch after this list). Please refer to Lines 274-292.
- Alternatively, you can use the pretrained model without the depth channel. Download it with the link provided above, and change the following arguments in the config file (as in configs/sg_labo_wo_depth.yaml):
conditioning_maps: [normal, diffuse, shading, mask]
eval:
  controlnet_model_name_or_path: checkpoints/openrooms_2days_wo_depth
  shading_maskout_mode: BBox
  shading_maskout_bbox_dilation: 50 # hyperparameter deciding how large a region to mask around the object
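Below is a minimal sketch of the first strategy, assuming metric depth maps and a binary object mask stored as numpy arrays; the function and variable names are illustrative, not the repo's actual API:
import numpy as np

def align_object_depth(object_depth, background_depth, object_mask):
    # Use the smallest background depth inside the object mask as the
    # minimum object depth when no footprint depth is available.
    min_bg = background_depth[object_mask > 0].min()
    min_obj = object_depth[object_mask > 0].min()
    aligned = object_depth.copy()
    # Shift the object depth so its closest point matches that minimum.
    aligned[object_mask > 0] += min_bg - min_obj
    return aligned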
You can get the intrinsic predictor weights from their original repos; the links are provided below. After downloading, move them to the .cache/checkpoints folder.
- ZoeDepth: ZoeD_M12_NK.pt
- DepthAnything: depth_anything_metric_depth_indoor.pt
- DepthAnythingV2: depth_anything_v2_vitl.pth
- OmniDataV2: omnidata_dpt_normal_v2.ckpt
- StableNormal: stable-normal-v0-1
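For example (the download location here is just illustrative):
mkdir -p .cache/checkpoints
mv ~/Downloads/ZoeD_M12_NK.pt .cache/checkpoints/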
For diffuse only, you can use our own (admittedly imperfect) model dfnet.
For diffuse, roughness and metallic, you can precompute these maps with IntrinsicImageDiffusion, RGB<->X or other predictors. Name them as in the provided test dataset and load them by changing predictor_names to precompute.
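For instance, in the config (the exact key layout is an assumption; it may need to be a list in practice):
predictor_names: precompute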
You can use other predictors you prefer by modifying controlnet_input_handle.py: implement handle_*** functions and modify the ToPredictors class.
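As a rough, self-contained illustration (the actual handle_*** signature and ToPredictors interface in the repo may differ, so treat this purely as a sketch):
import numpy as np

def handle_constant_normal(image):
    # Toy predictor: return an up-facing normal map with the same spatial size as the input image.
    h, w = image.shape[:2]
    normals = np.zeros((h, w, 3), dtype=np.float32)
    normals[..., 1] = 1.0
    return normals

class ToPredictors:
    # Stand-in for the repo's class: maps each conditioning channel to a handler function.
    def __init__(self):
        self.handlers = {"normal": handle_constant_normal}

    def __call__(self, image):
        return {name: handler(image) for name, handler in self.handlers.items()}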
All predictors are subject to their own licenses. Please check the relevant conditions carefully.
You can download the ZeroComp test dataset here.
To run the evaluations as in the paper:
python eval_controlnet_composite.py --config-name sg_labo
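For example, to evaluate the model trained without the depth channel using the config mentioned above:
python eval_controlnet_composite.py --config-name sg_labo_wo_depth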
ZeroComp trained on OpenRooms:
python gradio_composite.py
ZeroComp trained on InteriorVerse, with roughness and metallic:
python gradio_composite_w_rm.py
This research was supported by NSERC grants RGPIN 2020-04799 and ALLRP 586543-23, Mitacs and Depix. Computing resources were provided by the Digital Research Alliance of Canada. We also thank Louis-Étienne Messier and Justine Giroux for their help as well as all members of the lab for discussions and proofreading help.
This implementation builds upon Hugging Face’s Diffusers library. We also acknowledge Gradio for providing a developer-friendly tool to create the interactive demos for our models.
If you find it useful, please consider citing ZeroComp:
@InProceedings{zhang2025zerocomp,
author = {Zhang, Zitian and Fortier-Chouinard, Fr\'ed\'eric and Garon, Mathieu and Bhattad, Anand and Lalonde, Jean-Fran\c{c}ois},
title = {ZeroComp: Zero-Shot Object Compositing from Image Intrinsics via Diffusion},
booktitle = {Proceedings of the Winter Conference on Applications of Computer Vision (WACV)},
month = {February},
year = {2025},
pages = {483-494}
}
Our follow-up work SpotLight focuses on training-free local relighting; please feel free to check it out if you're interested :)
@misc{fortierchouinard2025spotlightshadowguidedobjectrelighting,
title={SpotLight: Shadow-Guided Object Relighting via Diffusion},
author={Frédéric Fortier-Chouinard and Zitian Zhang and Louis-Etienne Messier and Mathieu Garon and Anand Bhattad and Jean-François Lalonde},
year={2025},
eprint={2411.18665},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2411.18665},
}
The code, pretrained weights and test dataset are all for non-commercial use only.
ZeroComp: Zero-Shot Object Compositing from Image Intrinsics via Diffusion by Zitian Zhang, Frédéric Fortier-Chouinard, Mathieu Garon, Anand Bhattad, Jean-François Lalonde is licensed under CC BY-NC-SA 4.0