Instance-Warp: Saliency Guided Image Warping for Unsupervised Domain Adaptation
WACV 2025
Our adapted detection results on in-the-wild driving videos taken in bad weather.
Overview and Highlights
We oversample salient object regions by warping source-domain images in-place during training while performing domain adaptation.
Technical Highlights include:
- First work to show that saliency-guided image warping enhances backbone features and improves model adaptability for unsupervised domain adaptation.
- Our approach is agnostic to the adaptation task, target domain, saliency guidance, model architecture, and domain adaptation algorithm.
- Our approach is efficient, with minimal training overhead and no additional inference latency.
- Our approach improves adaptation across geographies, lighting and weather conditions.
Result Highlights include:
- +6.1 mAP50 for BDD100K Clear → DENSE Foggy
- +3.7 mAP50 for BDD100K Day → Night
- +3.0 mAP50 for BDD100K Clear → Rainy
- +6.3 mIoU for Cityscapes → ACDC
Why is Unsupervised Domain Adaptation Difficult?
Unsupervised Domain Adaptation (UDA) is challenging because models are trained with dominant scene backgrounds that appear dramatically different across domains. Specifically:
- Object-Background Pixel Imbalance: Backgrounds occupy far more pixels than objects.
- Object-Background Variation Imbalance: Backgrounds exhibit far larger cross-domain variations than objects.
Why Image Warping for Domain Adaptation?
Our image warping oversamples object regions and undersamples background areas. This focuses the model on salient objects while reducing attention to background context, leading to more robust and adaptable recognition models.
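To make the idea concrete, below is a minimal PyTorch sketch of saliency-guided warping in the spirit of saliency-sampler approaches: a sampling grid is pulled toward high-saliency pixels, so those regions are magnified (oversampled) while the background shrinks. The function names, kernel size, and sigma here are illustrative choices, not the paper's implementation.

```python
import torch
import torch.nn.functional as F

def make_warp_grid(saliency, kernel_size=31, sigma=8.0):
    """saliency: (N, 1, H, W), non-negative. Returns a sampling grid
    (N, H, W, 2) in [-1, 1] for F.grid_sample in which sample locations are
    attracted toward high-saliency pixels, magnifying those regions."""
    n, _, h, w = saliency.shape
    device = saliency.device
    ys = torch.linspace(-1.0, 1.0, h, device=device)
    xs = torch.linspace(-1.0, 1.0, w, device=device)
    grid_y, grid_x = torch.meshgrid(ys, xs, indexing="ij")   # (H, W) each

    # Gaussian kernel that lets each pixel "pull" nearby sample locations.
    ax = torch.arange(kernel_size, device=device, dtype=torch.float32) - kernel_size // 2
    gauss = torch.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    kernel = (gauss[:, None] * gauss[None, :]).view(1, 1, kernel_size, kernel_size)
    pad = kernel_size // 2

    def smooth(t):  # Gaussian blur of an (N, 1, H, W) map with replicate padding.
        return F.conv2d(F.pad(t, (pad, pad, pad, pad), mode="replicate"), kernel)

    # Saliency-weighted average of source coordinates for every target pixel:
    # samples are pulled toward nearby high-saliency pixels, oversampling them.
    denom = smooth(saliency) + 1e-6
    new_x = smooth(saliency * grid_x) / denom
    new_y = smooth(saliency * grid_y) / denom
    return torch.stack([new_x, new_y], dim=-1).squeeze(1)    # (N, H, W, 2)

def warp(image, grid):
    """Resample an (N, C, H, W) image with the saliency-guided grid."""
    return F.grid_sample(image, grid, align_corners=True)
```

The warp itself reduces to a single `F.grid_sample` call, which is why the training overhead stays small.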
Which Image Regions to Warp?
We propose instance-level saliency guidance, which explicitly oversamples all objects during training. While methods like Static Prior Guidance [Thavamani 2021, Thavamani 2023] or Geometric Prior Guidance [Ghosh 2023] could be used, they are not designed for domain adaptation and do not prioritize object instance regions.
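As an illustration, the guidance for a detection source image could be built directly from its ground-truth boxes, giving every instance a bump in the saliency map fed to the warp above. This is a hypothetical construction for the sketch; the guidance used in the paper may differ in detail.

```python
import torch

def instance_saliency(boxes, h, w, sigma_scale=0.5, floor=0.1):
    """boxes: (K, 4) tensor of (x1, y1, x2, y2) in pixels for one image.
    Returns a (1, 1, h, w) saliency map with one bump per instance plus a small
    uniform floor, so the background is undersampled but never discarded."""
    ys = torch.arange(h, dtype=torch.float32).view(h, 1)
    xs = torch.arange(w, dtype=torch.float32).view(1, w)
    saliency = torch.full((h, w), floor)
    for x1, y1, x2, y2 in boxes.tolist():
        cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
        sx = max(sigma_scale * (x2 - x1), 1.0)   # spread scales with box size
        sy = max(sigma_scale * (y2 - y1), 1.0)
        bump = torch.exp(-0.5 * (((xs - cx) / sx) ** 2 + ((ys - cy) / sy) ** 2))
        saliency = torch.maximum(saliency, bump)  # every instance gets oversampled
    return saliency.view(1, 1, h, w)
```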
Saliency-Guided Warping for Domain Adaptation
Our warping focuses on oversampling object regions and undersampling background regions. We warp source images based on saliency guidance and then unwarp the backbone features using the same guidance before making predictions. This method can be seamlessly integrated into existing domain adaptation algorithms and is agnostic to the task, domain adaptation algorithm, saliency guidance, and underlying model architecture. Empirically, we observed that warping only source-domain images is more effective than warping both source- and target-domain images. We do not warp or unwarp at test time.
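A schematic of how these pieces could slot into the source-supervised step of an existing adaptation pipeline is sketched below. `backbone`, `head`, and `unwarp_features` are placeholders for components the detector and domain adaptation framework already provide; this is not the paper's actual code.

```python
import torch.nn.functional as F

def source_step(image, gt_boxes, gt_targets, backbone, head, unwarp_features):
    """One source-supervised training step with warping (batch of one image)."""
    h, w = image.shape[-2:]
    saliency = instance_saliency(gt_boxes, h, w).to(image.device)  # guidance from GT instances
    grid = make_warp_grid(saliency)                                # see sketches above
    warped = F.grid_sample(image, grid, align_corners=True)        # oversample object regions

    feats = backbone(warped)              # backbone sees the warped image
    # Unwarp the features with the same guidance so that predictions and labels
    # live in the original image space; `unwarp_features` stands in for the
    # inverse resampling step (in the spirit of Learning to Zoom and Unzoom).
    feats = unwarp_features(feats, grid)
    return head(feats, gt_targets)        # unchanged task loss of the base method

# At test time the image goes straight through backbone and head:
# no warping, no unwarping, and therefore no added inference latency.
```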
Improved Backbone Features
Grad-CAM visualizations show that the model trained with our method focuses more strongly on salient objects, indicating better-learned features and improved scene comprehension.
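For readers who want to reproduce this kind of inspection, a minimal Grad-CAM-style routine over a chosen backbone layer looks roughly as follows; the layer choice and scoring function are assumptions, and the paper's figures may have been produced with a different tool.

```python
import torch

def grad_cam(model, target_layer, image, score_fn):
    """Returns a normalized (N, H', W') heat map for the scalar score produced
    by `score_fn(model(image))`, e.g. a class logit or box objectness."""
    acts, grads = {}, {}
    h1 = target_layer.register_forward_hook(lambda m, i, o: acts.update(a=o))
    h2 = target_layer.register_full_backward_hook(lambda m, gi, go: grads.update(g=go[0]))
    try:
        score = score_fn(model(image))   # forward pass records activations
        model.zero_grad()
        score.backward()                 # backward pass records gradients
    finally:
        h1.remove()
        h2.remove()
    weights = grads["g"].mean(dim=(2, 3), keepdim=True)    # global-average-pool the gradients
    cam = torch.relu((weights * acts["a"]).sum(dim=1))     # weighted sum of feature maps
    return cam / (cam.max() + 1e-6)
```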
Improves a Variety of Adaptation Scenarios
Because the learned backbone features are better, our approach improves performance across a variety of real-to-real domain adaptation tasks, including changes in weather, lighting, and geography. The results below are from an adapted model pre-trained on BDD100K images taken during the day in good weather.
Method is Task, Algorithm and Backbone Agnostic
Our method is agnostic to tasks, adaptation algorithms, and backbone architectures. It applies seamlessly to object detection and semantic segmentation, with either ConvNet or Transformer backbones.