TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Image Classification	CIFAR-10	WRN + fixup init + mixup + cutout	Percentage correct	97.7	# 68
Image Classification	CIFAR-10	WRN + fixup init + mixup + cutout	PARAMS	18M	# 204
Image Classification	SVHN	WRN + fixup init + mixup + cutout	Percentage error	1.4	# 9


Fixup Initialization: Residual Learning Without Normalization

ICLR 2019 · Hongyi Zhang, Yann N. Dauphin, Tengyu Ma

Normalization layers are a staple in state-of-the-art deep neural network architectures. They are widely believed to stabilize training, enable higher learning rates, accelerate convergence and improve generalization, though the reason for their effectiveness is still an active research topic. In this work, we challenge these commonly held beliefs by showing that none of the perceived benefits is unique to normalization. Specifically, we propose fixed-update initialization (Fixup), an initialization motivated by solving the exploding and vanishing gradient problem at the beginning of training by properly rescaling a standard initialization. We find training residual networks with Fixup to be as stable as training with normalization, even for networks with 10,000 layers. Furthermore, with proper regularization, Fixup enables residual networks without normalization to achieve state-of-the-art performance in image classification and machine translation.
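The abstract only says that Fixup rescales a standard initialization. As a rough illustration, the sketch below assumes the rule given in the paper itself: weight layers inside each residual branch start from a He-style initialization scaled by L^(-1/(2m-2)), where L is the number of residual branches and m the number of layers per branch, and the last layer of each branch starts at zero. The function names, the square layer width, and the pure-Python weight representation are hypothetical choices made for this example. It also omits the scalar biases and multipliers that the full method adds.

```python
import math
import random


def fixup_scale(num_branches: int, layers_per_branch: int) -> float:
    """Rescaling factor L^(-1/(2m-2)) for weights inside a residual branch."""
    if num_branches < 1 or layers_per_branch < 2:
        raise ValueError("need at least 1 branch and 2 layers per branch")
    return num_branches ** (-1.0 / (2 * layers_per_branch - 2))


def init_branch(num_branches, layers_per_branch, width, seed=0):
    """Return one residual branch as a list of width x width weight matrices.

    Hypothetical helper: every layer but the last gets He-style Gaussian
    weights multiplied by the Fixup factor; the last layer is all zeros,
    so the branch initially outputs zero and the block acts as the identity.
    """
    rng = random.Random(seed)
    # He-style standard deviation sqrt(2 / fan_in), then the Fixup rescaling.
    std = math.sqrt(2.0 / width) * fixup_scale(num_branches, layers_per_branch)
    layers = []
    for i in range(layers_per_branch):
        is_last = i == layers_per_branch - 1
        if is_last:
            weights = [[0.0] * width for _ in range(width)]
        else:
            weights = [[rng.gauss(0.0, std) for _ in range(width)]
                       for _ in range(width)]
        layers.append(weights)
    return layers


if __name__ == "__main__":
    # 16 residual branches with 2 layers each: factor is 16^(-1/2).
    print(fixup_scale(16, 2))
    branch = init_branch(num_branches=16, layers_per_branch=2, width=4)
    print(len(branch), "layers; last layer all zero:",
          all(w == 0.0 for row in branch[-1] for w in row))
```

Under this rule the factor shrinks as the network gets deeper, so the combined update from all branches stays bounded at the start of training. In a real framework the same scaling would be applied to the framework's own initializer output, not to nested lists.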


Code


hongyi-zhang/Fixup (149 stars)

bzhangGo/zero (145 stars)

Zelgunn/CustomKerasLayers

AngusG/bn-advex-zhang-fixup

yanivbl6/fixup

See all 7 implementations

Tasks


General Classification

Image Classification

Machine Translation

Translation

Datasets

CIFAR-10

SVHN

Results from the Paper


Ranked #9 on Image Classification on SVHN



Methods


Fixup Initialization
