Wide Residual Networks

23 May 2016 • Sergey Zagoruyko • Nikos Komodakis

Deep residual networks have been shown to scale up to thousands of layers while still improving in performance. However, each fraction of a percent of improved accuracy costs nearly a doubling of the number of layers, and very deep residual networks suffer from diminishing feature reuse, which makes them very slow to train. To tackle these problems, in this paper we conduct a detailed experimental study of the architecture of ResNet blocks, based on which we propose a novel architecture in which we decrease the depth and increase the width of residual networks.
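To make the depth-versus-width trade-off concrete, the following is a minimal sketch that estimates the number of weights in a wide residual network. It rests on assumptions not stated in the abstract above: the commonly described WRN-d-k layout, where depth d = 6N + 4, there are three groups of N basic blocks (two 3x3 convolutions each), and the group widths are 16k, 32k and 64k channels. The function name `wrn_weight_count` is hypothetical, and batch-norm parameters and convolution biases are ignored, so the result is only an approximation.

```python
def wrn_weight_count(depth, k, num_classes=10):
    """Approximate weight count for a WRN-depth-k style network.

    Assumed layout (not taken from the abstract): depth = 6*N + 4,
    three groups of N basic blocks, group widths 16k, 32k, 64k.
    Batch-norm parameters and conv biases are not counted.
    """
    if depth < 10 or (depth - 4) % 6 != 0:
        raise ValueError("depth must have the form 6*N + 4 with N >= 1")
    n_blocks = (depth - 4) // 6
    widths = [16 * k, 32 * k, 64 * k]

    total = 3 * 16 * 3 * 3  # initial 3x3 conv: RGB input -> 16 channels
    in_ch = 16
    for out_ch in widths:
        for _ in range(n_blocks):
            total += in_ch * out_ch * 9    # first 3x3 conv of the block
            total += out_ch * out_ch * 9   # second 3x3 conv of the block
            if in_ch != out_ch:
                total += in_ch * out_ch    # 1x1 projection on the shortcut
            in_ch = out_ch
    total += in_ch * num_classes + num_classes  # final linear classifier
    return total


if __name__ == "__main__":
    # Compare a deep, thin configuration with a shallower, wider one.
    for depth, k in [(40, 1), (40, 4), (28, 10)]:
        print(f"WRN-{depth}-{k}: ~{wrn_weight_count(depth, k) / 1e6:.1f}M weights")
```

Under these assumptions the weight count grows roughly linearly with depth but quadratically with the widening factor k, which is why a modest increase in width can replace a large increase in depth.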


Task                  Dataset    Model        Metric name       Metric value  Global rank
Image Classification  CIFAR-10   Wide ResNet  Percentage error  3.89          #63
Image Classification  CIFAR-100  Wide ResNet  Percentage error  18.85         #42
Image Classification  SVHN       Wide ResNet  Percentage error  1.7           #6