Neural networks with late-phase weights

The largely successful method of training neural networks is to learn their weights using some variant of stochastic gradient descent (SGD). Here, we show that the solutions found by SGD can be further improved by ensembling a subset of the weights in late stages of learning. At the end of learning, we obtain back a single model by taking a spatial average in weight space. To avoid incurring increased computational costs, we investigate a family of low-dimensional late-phase weight models which interact multiplicatively with the remaining parameters. Our results show that augmenting standard models with late-phase weights improves generalization in established benchmarks such as CIFAR-10/100, ImageNet and enwik8. These findings are complemented with a theoretical analysis of a noisy quadratic problem which provides a simplified picture of the late phases of neural network learning.
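To make the idea above concrete, here is a minimal, illustrative PyTorch sketch (not the authors' released implementation). The choice of BatchNorm affine parameters as the low-dimensional late-phase subset, the function names (`batchnorm_params`, `train_with_late_phase_weights`), the perturbation scale, and the per-step sampling of a single ensemble member are assumptions made for this example; the paper averages over members and considers several late-phase parameterizations.

```python
# Minimal sketch of late-phase weight ensembling, assuming a vision model whose
# BatchNorm affine parameters act as the low-dimensional late-phase weights.
import torch
import torch.nn as nn


def batchnorm_params(model):
    """Collect the affine parameters of all BatchNorm layers (the assumed late-phase subset)."""
    params = []
    for m in model.modules():
        if isinstance(m, nn.BatchNorm2d) and m.affine:
            params += [m.weight, m.bias]
    return params


def train_with_late_phase_weights(model, loader, loss_fn, epochs, late_start, K=5, lr=0.1):
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    late_ensemble = None  # K copies of the late-phase subset, spawned at `late_start`

    for epoch in range(epochs):
        if epoch == late_start:
            # Spawn K slightly perturbed copies of the current late-phase weights.
            late_ensemble = [
                [p.detach().clone() + 1e-2 * torch.randn_like(p) for p in batchnorm_params(model)]
                for _ in range(K)
            ]
        for x, y in loader:
            if late_ensemble is not None:
                # Sample one member per step (a simplification of averaging over all members)
                # and load its late-phase weights into the model.
                k = torch.randint(K, (1,)).item()
                with torch.no_grad():
                    for p, q in zip(batchnorm_params(model), late_ensemble[k]):
                        p.copy_(q)
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
            if late_ensemble is not None:
                # Write the updated late-phase weights back into the sampled member.
                with torch.no_grad():
                    for p, q in zip(batchnorm_params(model), late_ensemble[k]):
                        q.copy_(p)

    # End of learning: average the late-phase copies in weight space to recover a single model.
    if late_ensemble is not None:
        with torch.no_grad():
            for i, p in enumerate(batchnorm_params(model)):
                p.copy_(torch.stack([member[i] for member in late_ensemble], dim=0).mean(dim=0))
    return model
```

In this sketch the base weights stay shared across all members, so the only extra memory cost is the K copies of the small late-phase subset, which is the motivation for keeping those weights low-dimensional.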


Results from the Paper


Ranked #70 on Image Classification on CIFAR-100 (using extra training data)

Task                  Dataset    Model      Metric              Value   Global Rank
Image Classification  CIFAR-10   WRN 28-14  Percentage correct  97.45   #75
Image Classification  CIFAR-10   WRN 28-14  Params              36.5M   #221
Image Classification  CIFAR-10   WRN 28-10  Percentage correct  96.81   #93
Image Classification  CIFAR-100  WRN 28-10  Percentage correct  83.06   #92
Image Classification  CIFAR-100  WRN 28-14  Percentage correct  85.00   #70

Methods