We propose a simple yet effective framework for improving generalization from small amounts of data. In our work, we bring back fully-connected layers at the end of CNN-based architectures. We show that by adding as little as 0.37% extra parameters during training, we can significantly improve generalization in the low-data regime. Our network architecture consists of two main parts: a convolutional backbone network and our proposed Feature Refiner (FR), which is based on multi-layer perceptrons. Our method is task- and model-agnostic and can be applied to many convolutional networks. We first extract features with the convolutional backbone network, then apply our FR followed by a task-specific head. More precisely, we reduce the feature dimension d_bbf to d_frf with a single linear layer to limit the number of extra parameters, and then apply a symmetric two-layer MLP wrapped in normalization layers.
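For concreteness, the following is a minimal PyTorch-style sketch of such a Feature Refiner. The hidden width d_frf = 256, the use of LayerNorm as the wrapping normalization, and the GELU activation are illustrative assumptions rather than values taken from the text above.

import torch
import torch.nn as nn

class FeatureRefiner(nn.Module):
    # Sketch of a Feature Refiner: a single dimension-reducing linear layer
    # followed by a symmetric two-layer MLP wrapped in normalization layers.
    # d_frf = 256, LayerNorm, and GELU are assumptions for illustration.
    def __init__(self, d_bbf: int, d_frf: int = 256):
        super().__init__()
        # Reduce the backbone feature dimension d_bbf to d_frf with one
        # linear layer to keep the number of extra parameters small.
        self.reduce = nn.Linear(d_bbf, d_frf)
        # Symmetric two-layer MLP wrapped in normalization layers.
        self.mlp = nn.Sequential(
            nn.LayerNorm(d_frf),
            nn.Linear(d_frf, d_frf),
            nn.GELU(),
            nn.Linear(d_frf, d_frf),
            nn.LayerNorm(d_frf),
        )

    def forward(self, backbone_features: torch.Tensor) -> torch.Tensor:
        return self.mlp(self.reduce(backbone_features))

A task-specific head, for example a linear classifier, is then applied on top of the refined features.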
One could argue that adding parameters improves performance simply because of the increased expressivity of the network. To rule out this explanation, we develop an online joint knowledge distillation (OJKD) method. OJKD uses our FR solely during training, which lets us keep the exact same architecture as our baseline networks during inference.
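Below is a minimal sketch of one possible training objective for such an online joint knowledge distillation, assuming two classification heads: one on top of the FR (used only during training) and one directly on the backbone features (kept for inference). The distillation direction, softmax temperature, and loss weighting are assumptions and may differ from the exact formulation used in the paper.

import torch.nn.functional as F

def ojkd_loss(backbone, refiner, fr_head, baseline_head, images, labels,
              temperature=4.0, kd_weight=1.0):
    # Hypothetical OJKD objective: both heads share the backbone and are
    # trained jointly; the FR branch is dropped at test time, so inference
    # uses exactly the baseline architecture.
    features = backbone(images)             # shared CNN features
    fr_logits = fr_head(refiner(features))  # FR branch, train time only
    base_logits = baseline_head(features)   # baseline branch, kept at test time

    # Supervised cross-entropy for both branches.
    ce = F.cross_entropy(fr_logits, labels) + F.cross_entropy(base_logits, labels)

    # Distill the FR branch's softened predictions into the baseline branch.
    kd = F.kl_div(
        F.log_softmax(base_logits / temperature, dim=1),
        F.softmax(fr_logits.detach() / temperature, dim=1),
        reduction="batchmean",
    ) * temperature ** 2

    return ce + kd_weight * kd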
We compare the results of our method with those of ResNet18. In the first training cycle (1000 labels), our method outperforms ResNet18 by 7.6 percentage points (pp). In the second cycle, we outperform ResNet18 by more than 10 pp. We keep outperforming ResNet18 until the seventh cycle, where our improvement is half a percentage point; for the remaining cycles, both methods reach the same accuracy. A common tendency across all datasets is that the gap between our method and the baseline shrinks as the number of labeled samples grows. Therefore, dropping the fully-connected layers does not cause any disadvantage when a large labeled dataset is available, as was found in [6]. However, that work did not analyze this question in the low-data regime, where using FC layers after CNN architectures is clearly beneficial.
We also check whether our method can be used with backbones other than ResNet18. The goal of this experiment is to show that our method is backbone-agnostic and generalizes both to different versions of ResNet and to other types of convolutional neural networks. Our method significantly outperforms the baselines on both datasets and for all three types of backbones.
The Unreasonable Effectiveness of Fully-Connected Layers for Low-Data Regimes
NeurIPS 2022
Peter Kocsis
TU Munich
Peter Súkeník
IST Austria
Guillem Brasó
TU Munich
Matthias Nießner
TU Munich
Laura Leal-Taixé
TU Munich
Ismail Elezi
TU Munich
Abstract
Convolutional neural networks were the standard for solving many computer vision tasks until recently, when Transformer- or MLP-based architectures started to show competitive performance. These architectures typically have a vast number of weights and need to be trained on massive datasets; hence, they are not suitable for use in low-data regimes. In this work, we propose a simple yet effective framework to improve generalization from small amounts of data. We augment modern CNNs with fully-connected (FC) layers and show the massive impact this architectural change has in low-data regimes. We further present an online joint knowledge-distillation method to utilize the extra FC layers at train time but avoid them during test time. This allows us to improve the generalization of a CNN-based model without any increase in the number of weights at test time. We perform classification experiments for a large range of network backbones and several standard datasets on supervised learning and active learning. Our models significantly outperform networks without fully-connected layers, reaching a relative improvement of up to 16% in validation accuracy in the supervised setting without adding any extra parameters during inference.
Method
Feature Refiner
Online Joint Knowledge Distillation
Experiments
Citation
@inproceedings{kocsis2022lowdataregime,
author = {Peter Kocsis
and Peter S\'{u}ken\'{i}k
and Guillem Bras\'{o}
and Matthias Nie{\ss}ner
and Laura Leal-Taix\'{e}
and Ismail Elezi},
title = {The Unreasonable Effectiveness of Fully-Connected Layers for Low-Data Regimes},
booktitle = {Proc. NeurIPS},
year={2022}
}