Efficiently exploiting multi-modal inputs for accurate RGB-D saliency detection is a topic of high interest. Most existing works leverage cross-modal interactions to fuse the RGB and depth streams and enhance intermediate features. In this process, a practical aspect has not yet been fully considered: the low quality of the available depth maps. In this work, we aim for RGB-D saliency detection that is robust to low-quality depths, which primarily appear in two forms: inaccuracy due to noise and misalignment with the RGB input. To this end, we propose a robust RGB-D fusion method that benefits from (1) layer-wise and (2) trident spatial attention mechanisms. On the one hand, layer-wise attention (LWA) learns the trade-off between early and late fusion of RGB and depth features, depending on the depth accuracy. On the other hand, trident spatial attention (TSA) aggregates features from a wider spatial context to address the depth-misalignment problem.
The proposed LWA and TSA mechanisms allow us to efficiently exploit the multi-modal inputs for saliency detection while remaining robust against low-quality depths. Our experiments on five benchmark datasets demonstrate that the proposed fusion method consistently outperforms state-of-the-art fusion alternatives.
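The two mechanisms above can be sketched in a few lines. The toy NumPy snippet below is only an illustration of the idea, not the paper's implementation: a sigmoid-gated scalar (standing in for the learned layer-wise weight) blends the depth features into the RGB features, and a multi-dilation neighborhood average (standing in for the trident branches) produces a spatial attention map that tolerates small RGB-depth misalignment. The gate value, dilation rates, and average pooling are all illustrative assumptions.

```python
import numpy as np

def layer_wise_attention(rgb_feat, depth_feat, gate):
    """Sketch of LWA: a learned scalar gate decides how strongly the
    depth features are mixed into the RGB features at this layer.
    gate -> large: trust depth (early-fusion-like); gate -> small:
    fall back on RGB alone (late-fusion-like)."""
    g = 1.0 / (1.0 + np.exp(-gate))          # sigmoid, g in (0, 1)
    return rgb_feat + g * depth_feat

def trident_spatial_attention(feat, dilations=(1, 2, 3)):
    """Sketch of TSA: pool a 3x3 neighborhood at several dilation
    rates so the attention map sees a wider spatial context, which
    helps when the depth map is misaligned with the RGB image."""
    h, w, c = feat.shape
    branches = []
    for d in dilations:
        pad = np.pad(feat, ((d, d), (d, d), (0, 0)), mode="edge")
        # average over the 9 dilated neighbors at offsets {-d, 0, d}
        neigh = sum(
            pad[d + dy * d : d + dy * d + h, d + dx * d : d + dx * d + w]
            for dy in (-1, 0, 1) for dx in (-1, 0, 1)
        ) / 9.0
        branches.append(neigh)
    ctx = np.mean(branches, axis=0)                     # (h, w, c)
    # collapse channels and squash to a per-pixel attention in (0, 1)
    attn = 1.0 / (1.0 + np.exp(-ctx.mean(axis=-1, keepdims=True)))
    return feat * attn
```

In the actual network these roles are played by learned attention modules rather than fixed pooling; the sketch only shows how a per-layer gate and a wider spatial context combine into one fusion path.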
Train and Test
Both training and testing are launched with a single command:
python run_RFNet.py
Saliency Maps
The saliency maps can be downloaded here: Google Drive
Citation
If you find this repo useful, please consider citing:
@INPROCEEDINGS{wu2022robust,
author={Wu, Zongwei and Gobichettipalayam, Shriarulmozhivarman and Tamadazte, Brahim and Allibert, Guillaume and Paudel, Danda Pani and Demonceaux, Cédric},
booktitle={2022 International Conference on 3D Vision (3DV)},
title={Robust RGB-D Fusion for Saliency Detection},
year={2022},
pages={403-413},
doi={10.1109/3DV57658.2022.00052}}
Related works
ICCV 23 - Source-free Depth for Object Pop-out [Code]
ACMMM 23 - Object Segmentation by Mining Cross-Modal Semantics [Code]
TIP 23 - HiDANet: RGB-D Salient Object Detection via Hierarchical Depth Awareness [Code]
3DV 21 - Modality-Guided Subnetwork for Salient Object Detection [Code]
Acknowledgments
This repository is heavily based on SPNet. Thanks to their great work!
About
[3DV 22] Official implementation of Robust RGB-D Fusion Network for Saliency Detection