Side-tuning
Side-Tuning: A Baseline for Network Adaptation via Additive Side Networks
When training a neural network for a desired task, one may prefer to adapt a pre-trained network rather than start from a randomly initialized one: training data may be scarce, the system may need to learn a new task on top of tasks it was previously trained on (lifelong learning), or one may wish to encode priors in the network via preset weights. The most commonly employed approaches for network adaptation are fine-tuning and using the pre-trained network as a fixed feature extractor.
In this paper we propose a straightforward alternative: Side-Tuning. Side-tuning adapts a pre-trained network by training a lightweight "side" network that is fused with the (unchanged) pre-trained network using a simple additive process. This simple method works as well as or better than existing solutions while resolving some of the basic issues with fine-tuning, fixed features, and several other common baselines. In particular, side-tuning is less prone to overfitting when little training data is available, yields better results than using a fixed feature extractor, and does not suffer from catastrophic forgetting in lifelong learning. We demonstrate the performance of side-tuning under a diverse set of scenarios, including lifelong learning (iCIFAR, Taskonomy), reinforcement learning, imitation learning (visual navigation in Habitat), NLP question-answering (SQuAD v2), and single-task transfer learning (Taskonomy), with consistently promising results.
In ECCV 2020 (Spotlight)
[GitHub]
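Concretely, the additive fusion can be pictured as a frozen pre-trained "base" network whose output is summed with the output of a small trainable "side" network. The snippet below is a minimal PyTorch sketch of that idea; the class name SideTuneWrapper, the sigmoid-blended scalar alpha, and the toy encoders are illustrative assumptions, not the authors' reference implementation (see the [GitHub] link above for that).

```python
# Minimal sketch of additive side-tuning, assuming a frozen pre-trained
# base network and a lightweight trainable side network whose outputs
# have the same shape and are blended by a learned scalar alpha.
import torch
import torch.nn as nn

class SideTuneWrapper(nn.Module):
    def __init__(self, base: nn.Module, side: nn.Module):
        super().__init__()
        self.base = base
        self.side = side
        # Freeze the pre-trained network; only the side network (and alpha) are trained.
        for p in self.base.parameters():
            p.requires_grad = False
        # Learned scalar blending weight (assumption: a simple convex combination;
        # other additive fusions also fit the idea described in the abstract).
        self.alpha_logit = nn.Parameter(torch.zeros(1))

    def forward(self, x):
        with torch.no_grad():
            base_feat = self.base(x)      # fixed pre-trained features
        side_feat = self.side(x)          # lightweight, trainable features
        alpha = torch.sigmoid(self.alpha_logit)
        return alpha * base_feat + (1.0 - alpha) * side_feat

# Usage sketch: any frozen encoder plus a small trainable encoder with matching
# output size, followed by a task-specific head trained as usual.
base = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 128))  # stand-in for a pre-trained encoder
side = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 128))  # lightweight side network
model = SideTuneWrapper(base, side)
features = model(torch.randn(4, 3, 32, 32))  # -> (4, 128) fused representation
```

Because the base network is never updated, its original features remain available for earlier tasks, which is the intuition behind side-tuning avoiding catastrophic forgetting in lifelong learning.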