HRNet

Last updated on Feb 14, 2021

hrnet_w18

Parameters: 21 Million
FLOPs: 6 Billion
File Size: 81.75 MB
Training Data: ImageNet
Training Resources: 4x NVIDIA V100 GPUs
Training Techniques: Nesterov Accelerated Gradient, Weight Decay
Architecture: Batch Normalization, Convolution, ReLU, Residual Connection
ID: hrnet_w18
Epochs: 100
Layers: 18
Crop Pct: 0.875
Momentum: 0.9
Batch Size: 256
Image Size: 224
Weight Decay: 0.001
Interpolation: bilinear
hrnet_w18_small

Parameters: 13 Million
FLOPs: 2 Billion
File Size: 50.48 MB
Training Data: ImageNet
Training Resources: 4x NVIDIA V100 GPUs
Training Techniques: Nesterov Accelerated Gradient, Weight Decay
Architecture: Batch Normalization, Convolution, ReLU, Residual Connection
ID: hrnet_w18_small
Epochs: 100
Layers: 18
Crop Pct: 0.875
Momentum: 0.9
Batch Size: 256
Image Size: 224
Weight Decay: 0.001
Interpolation: bilinear
hrnet_w18_small_v2

Parameters: 16 Million
FLOPs: 3 Billion
File Size: 59.78 MB
Training Data: ImageNet
Training Resources: 4x NVIDIA V100 GPUs
Training Techniques: Nesterov Accelerated Gradient, Weight Decay
Architecture: Batch Normalization, Convolution, ReLU, Residual Connection
ID: hrnet_w18_small_v2
Epochs: 100
Layers: 18
Crop Pct: 0.875
Momentum: 0.9
Batch Size: 256
Image Size: 224
Weight Decay: 0.001
Interpolation: bilinear
hrnet_w30

Parameters: 38 Million
FLOPs: 10 Billion
File Size: 144.44 MB
Training Data: ImageNet
Training Resources: 4x NVIDIA V100 GPUs
Training Techniques: Nesterov Accelerated Gradient, Weight Decay
Architecture: Batch Normalization, Convolution, ReLU, Residual Connection
ID: hrnet_w30
Epochs: 100
Layers: 30
Crop Pct: 0.875
Momentum: 0.9
Batch Size: 256
Image Size: 224
Weight Decay: 0.001
Interpolation: bilinear
hrnet_w32

Parameters: 41 Million
FLOPs: 12 Billion
File Size: 157.88 MB
Training Data: ImageNet
Training Resources: 4x NVIDIA V100 GPUs
Training Time: 60 hours
Training Techniques: Nesterov Accelerated Gradient, Weight Decay
Architecture: Batch Normalization, Convolution, ReLU, Residual Connection
ID: hrnet_w32
Epochs: 100
Layers: 32
Crop Pct: 0.875
Momentum: 0.9
Batch Size: 256
Image Size: 224
Weight Decay: 0.001
Interpolation: bilinear
hrnet_w40

Parameters: 58 Million
FLOPs: 16 Billion
File Size: 220.20 MB
Training Data: ImageNet
Training Resources: 4x NVIDIA V100 GPUs
Training Techniques: Nesterov Accelerated Gradient, Weight Decay
Architecture: Batch Normalization, Convolution, ReLU, Residual Connection
ID: hrnet_w40
Epochs: 100
Layers: 40
Crop Pct: 0.875
Momentum: 0.9
Batch Size: 256
Image Size: 224
Weight Decay: 0.001
Interpolation: bilinear
hrnet_w44

Parameters: 67 Million
FLOPs: 19 Billion
File Size: 256.50 MB
Training Data: ImageNet
Training Resources: 4x NVIDIA V100 GPUs
Training Techniques: Nesterov Accelerated Gradient, Weight Decay
Architecture: Batch Normalization, Convolution, ReLU, Residual Connection
ID: hrnet_w44
Epochs: 100
Layers: 44
Crop Pct: 0.875
Momentum: 0.9
Batch Size: 256
Image Size: 224
Weight Decay: 0.001
Interpolation: bilinear
hrnet_w48

Parameters: 77 Million
FLOPs: 22 Billion
File Size: 296.21 MB
Training Data: ImageNet
Training Resources: 4x NVIDIA V100 GPUs
Training Time: 80 hours
Training Techniques: Nesterov Accelerated Gradient, Weight Decay
Architecture: Batch Normalization, Convolution, ReLU, Residual Connection
ID: hrnet_w48
Epochs: 100
Layers: 48
Crop Pct: 0.875
Momentum: 0.9
Batch Size: 256
Image Size: 224
Weight Decay: 0.001
Interpolation: bilinear
hrnet_w64

Parameters: 128 Million
FLOPs: 37 Billion
File Size: 489.30 MB
Training Data: ImageNet
Training Resources: 4x NVIDIA V100 GPUs
Training Techniques: Nesterov Accelerated Gradient, Weight Decay
Architecture: Batch Normalization, Convolution, ReLU, Residual Connection
ID: hrnet_w64
Epochs: 100
Layers: 64
Crop Pct: 0.875
Momentum: 0.9
Batch Size: 256
Image Size: 224
Weight Decay: 0.001
Interpolation: bilinear
Summary

HRNet, or High-Resolution Net, is a general-purpose convolutional neural network for tasks such as semantic segmentation, object detection, and image classification. It maintains high-resolution representations throughout the network. The model starts from a high-resolution convolution stream, gradually adds high-to-low-resolution convolution streams one by one, and connects the multi-resolution streams in parallel. The resulting network consists of several stages ($4$ in the paper), and the $n$th stage contains $n$ streams corresponding to $n$ resolutions. The authors perform repeated multi-resolution fusions by exchanging information across the parallel streams over and over.
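As an illustration, one multi-resolution fusion step can be sketched in NumPy: each stream aggregates every other stream's feature map, downsampled or upsampled to its own resolution, then adds it to its own. The resampling operators below are deliberate simplifications (strided subsampling and nearest-neighbor repetition); HRNet itself uses strided convolutions for downsampling and bilinear upsampling followed by a 1x1 convolution.

```python
import numpy as np

def downsample(x, factor):
    # Strided subsampling stand-in for HRNet's strided 3x3 convolutions.
    return x[::factor, ::factor]

def upsample(x, factor):
    # Nearest-neighbor stand-in for HRNet's bilinear upsample + 1x1 conv.
    return x.repeat(factor, axis=0).repeat(factor, axis=1)

def fuse(streams):
    """One multi-resolution fusion: every stream receives all others,
    resampled to its own resolution. streams[i] has shape (H/2^i, W/2^i)."""
    fused = []
    for i, target in enumerate(streams):
        out = target.copy()
        for j, src in enumerate(streams):
            if i == j:
                continue
            factor = 2 ** abs(i - j)
            out += downsample(src, factor) if j < i else upsample(src, factor)
        fused.append(out)
    return fused

# Three parallel streams at resolutions 32x32, 16x16, and 8x8.
streams = [np.ones((32, 32)), np.ones((16, 16)), np.ones((8, 8))]
fused = fuse(streams)
print([f.shape for f in fused])  # each stream keeps its own resolution
```

Note how fusion exchanges information without collapsing the streams: the output has the same per-stream resolutions as the input, which is what lets HRNet keep a high-resolution representation through the whole network.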

How do I load this model?

To load a pretrained model:

import timm
m = timm.create_model('hrnet_w30', pretrained=True)
m.eval()

Replace the model name with the variant you want to use, e.g. hrnet_w30. You can find the IDs in the model summaries at the top of this page.
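The Crop Pct of 0.875 listed in the summaries above means evaluation images are resized to Image Size / Crop Pct (224 / 0.875 = 256) and then center-cropped to 224. A minimal sketch of that preprocessing on a raw array, assuming square inputs and using nearest-neighbor resizing for brevity (timm's actual pipeline uses the bilinear interpolation listed above):

```python
import numpy as np

def eval_preprocess(img, img_size=224, crop_pct=0.875):
    # Resize to img_size / crop_pct, then take a center crop of img_size.
    scale_size = int(round(img_size / crop_pct))  # 224 / 0.875 -> 256
    h, w = img.shape[:2]
    # Nearest-neighbor resize stand-in (timm uses bilinear for HRNet).
    rows = np.arange(scale_size) * h // scale_size
    cols = np.arange(scale_size) * w // scale_size
    resized = img[rows][:, cols]
    top = (scale_size - img_size) // 2
    left = (scale_size - img_size) // 2
    return resized[top:top + img_size, left:left + img_size]

img = np.zeros((500, 500, 3), dtype=np.uint8)  # dummy square image
out = eval_preprocess(img)
print(out.shape)  # (224, 224, 3)
```

In practice you would let timm build the transform for you from the model's pretrained config rather than hand-rolling it.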

How do I train this model?

You can follow the timm recipe scripts for training a new model from scratch.
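The hyperparameters listed above (momentum 0.9, weight decay 0.001) correspond to SGD with Nesterov momentum and L2 weight decay. As a reference for what a single optimizer step computes, here is the update in the PyTorch SGD convention, written in plain NumPy; the learning rate and toy objective are illustrative, not the recipe's actual schedule.

```python
import numpy as np

def sgd_nesterov_step(w, grad, buf, lr=0.1, momentum=0.9, weight_decay=0.001):
    """One SGD step with Nesterov momentum and L2 weight decay,
    following the PyTorch SGD update order."""
    d = grad + weight_decay * w   # weight decay folded into the gradient
    buf = momentum * buf + d      # momentum buffer update
    d = d + momentum * buf        # Nesterov look-ahead correction
    return w - lr * d, buf

# Toy objective f(w) = 0.5 * ||w||^2, whose gradient is w itself.
w = np.array([1.0, -2.0])
buf = np.zeros_like(w)
for _ in range(100):
    w, buf = sgd_nesterov_step(w, grad=w, buf=buf)
print(np.abs(w).max())  # driven toward 0 by the updates
```

The Nesterov correction applies the momentum buffer on top of the current gradient, which gives the "look-ahead" behavior that plain heavy-ball momentum lacks.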

Citation

@misc{sun2019highresolution,
      title={High-Resolution Representations for Labeling Pixels and Regions}, 
      author={Ke Sun and Yang Zhao and Borui Jiang and Tianheng Cheng and Bin Xiao and Dong Liu and Yadong Mu and Xinggang Wang and Wenyu Liu and Jingdong Wang},
      year={2019},
      eprint={1904.04514},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Results

Image Classification on ImageNet

MODEL                TOP 1 ACCURACY    TOP 5 ACCURACY
hrnet_w64            79.46%            94.65%
hrnet_w48            79.32%            94.51%
hrnet_w40            78.93%            94.48%
hrnet_w44            78.89%            94.37%
hrnet_w32            78.45%            94.19%
hrnet_w30            78.21%            94.22%
hrnet_w18            76.76%            93.44%
hrnet_w18_small_v2   75.11%            92.41%
hrnet_w18_small      72.34%            90.68%