Deep High-Resolution Representation Learning for Visual Recognition

High-resolution representations are essential for position-sensitive vision problems, such as human pose estimation, semantic segmentation, and object detection. Existing state-of-the-art frameworks first encode the input image as a low-resolution representation through a subnetwork that is formed by connecting high-to-low resolution convolutions in series (e.g., ResNet, VGGNet), and then recover the high-resolution representation from the encoded low-resolution representation...
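
The encode-then-recover pattern that the abstract attributes to existing frameworks can be illustrated with a toy sketch. This is not the paper's method or any library's API: 2x2 average pooling stands in for the strided convolutions, nearest-neighbor upsampling stands in for the recovery stage, and the names `downsample2x`, `upsample2x`, and `encode_then_recover` are hypothetical. It only shows that the spatial size is restored while fine detail is not.

```python
from typing import List

Grid = List[List[float]]


def downsample2x(x: Grid) -> Grid:
    """Halve the resolution with 2x2 average pooling.

    Assumption: pooling is a stand-in for a learned strided convolution.
    """
    h, w = len(x) // 2, len(x[0]) // 2
    return [
        [
            (x[2 * i][2 * j] + x[2 * i][2 * j + 1]
             + x[2 * i + 1][2 * j] + x[2 * i + 1][2 * j + 1]) / 4.0
            for j in range(w)
        ]
        for i in range(h)
    ]


def upsample2x(x: Grid) -> Grid:
    """Double the resolution with nearest-neighbor upsampling."""
    out: Grid = []
    for row in x:
        wide = [v for v in row for _ in range(2)]  # repeat each column
        out.append(wide)
        out.append(list(wide))                     # repeat each row
    return out


def encode_then_recover(x: Grid, stages: int) -> Grid:
    """Apply `stages` downsampling steps in series, then upsample back."""
    low = x
    for _ in range(stages):
        low = downsample2x(low)
    high = low
    for _ in range(stages):
        high = upsample2x(high)
    return high


# An 8x8 "image" with 64 distinct values.
image = [[float(r * 8 + c) for c in range(8)] for r in range(8)]
recovered = encode_then_recover(image, stages=2)

print(len(recovered), len(recovered[0]))           # 8 8
print(len({v for row in recovered for v in row}))  # 4
```

The recovered grid has the original 8x8 size but only 4 distinct values, one per 4x4 block, which is the kind of spatial detail loss that motivates maintaining a high-resolution representation.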

Results from the Paper


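Task | Dataset | Model | Metric | Value | Global rank
Semantic Segmentation | CamVid | HRNetV2 (HRNetV2-W48) | Mean IoU | 78.47% | #2
Semantic Segmentation | Cityscapes val | HRNetV2 (HRNetV2-W48) | mIoU | 81.1% | #3
Semantic Segmentation | Cityscapes val | HRNetV2 (HRNetV2-W40) | mIoU | 80.2% | #8
Object Detection | COCO minival | Cascade R-CNN (HRNetV2p-W48) | box AP | 44.6 | #26
Object Detection | COCO minival | Cascade R-CNN (HRNetV2p-W48) | AP50 | 62.7 | #22
Object Detection | COCO minival | Cascade R-CNN (HRNetV2p-W48) | AP75 | 48.7 | #16
Object Detection | COCO minival | Cascade R-CNN (HRNetV2p-W48) | APS | 26.3 | #17
Object Detection | COCO minival | Cascade R-CNN (HRNetV2p-W48) | APM | 48.1 | #12
Object Detection | COCO minival | Cascade R-CNN (HRNetV2p-W48) | APL | 58.5 | #15
Object Detection | COCO minival | Faster R-CNN (HRNetV2p-W32) | box AP | 40.9 | #46
Object Detection | COCO minival | Faster R-CNN (HRNetV2p-W32) | AP50 | 61.8 | #26
Object Detection | COCO minival | Faster R-CNN (HRNetV2p-W32) | AP75 | 44.8 | #31
Object Detection | COCO minival | Faster R-CNN (HRNetV2p-W32) | APS | 24.4 | #26
Object Detection | COCO minival | Faster R-CNN (HRNetV2p-W32) | APM | 43.7 | #31
Object Detection | COCO minival | Faster R-CNN (HRNetV2p-W32) | APL | 53.3 | #29
Object Detection | COCO minival | Faster R-CNN (HRNetV2p-W18) | box AP | 38.0 | #65
Object Detection | COCO minival | Faster R-CNN (HRNetV2p-W18) | AP50 | 58.9 | #41
Object Detection | COCO minival | Faster R-CNN (HRNetV2p-W18) | AP75 | 41.5 | #45
Object Detection | COCO minival | Faster R-CNN (HRNetV2p-W18) | APS | 22.6 | #33
Object Detection | COCO minival | Faster R-CNN (HRNetV2p-W18) | APM | 40.8 | #40
Object Detection | COCO minival | Faster R-CNN (HRNetV2p-W18) | APL | 49.6 | #42
Object Detection | COCO minival | HTC (HRNetV2p-W48) | box AP | 47.0 | #15
Object Detection | COCO minival | HTC (HRNetV2p-W48) | APS | 28.8 | #6
Object Detection | COCO minival | HTC (HRNetV2p-W48) | APM | 50.3 | #6
Object Detection | COCO minival | HTC (HRNetV2p-W48) | APL | 62.2 | #6
Object Detection | COCO minival | Mask R-CNN (HRNetV2p-W48, cascade) | box AP | 46.0 | #19
Object Detection | COCO minival | Mask R-CNN (HRNetV2p-W48, cascade) | APS | 27.5 | #11
Object Detection | COCO minival | Mask R-CNN (HRNetV2p-W48, cascade) | APM | 48.9 | #8
Object Detection | COCO minival | Mask R-CNN (HRNetV2p-W48, cascade) | APL | 60.1 | #9
Object Detection | COCO minival | HTC (HRNetV2p-W32) | box AP | 45.3 | #22
Object Detection | COCO minival | HTC (HRNetV2p-W32) | APS | 27 | #13
Object Detection | COCO minival | HTC (HRNetV2p-W32) | APM | 48.4 | #10
Object Detection | COCO minival | HTC (HRNetV2p-W32) | APL | 59.5 | #11
Object Detection | COCO minival | Mask R-CNN (HRNetV2p-W32, cascade) | box AP | 44.5 | #27
Object Detection | COCO minival | Mask R-CNN (HRNetV2p-W32, cascade) | APS | 26.1 | #18
Object Detection | COCO minival | Mask R-CNN (HRNetV2p-W32, cascade) | APM | 47.9 | #13
Object Detection | COCO minival | Mask R-CNN (HRNetV2p-W32, cascade) | APL | 58.5 | #15
Object Detection | COCO minival | HTC (HRNetV2p-W18) | box AP | 43.1 | #34
Object Detection | COCO minival | HTC (HRNetV2p-W18) | APS | 26.6 | #16
Object Detection | COCO minival | HTC (HRNetV2p-W18) | APM | 46 | #21
Object Detection | COCO minival | HTC (HRNetV2p-W18) | APL | 56.9 | #21
Object Detection | COCO minival | Mask R-CNN (HRNetV2p-W32) | box AP | 42.3 | #38
Object Detection | COCO minival | Mask R-CNN (HRNetV2p-W32) | APS | 25.0 | #24
Object Detection | COCO minival | Mask R-CNN (HRNetV2p-W32) | APM | 45.4 | #23
Object Detection | COCO minival | Mask R-CNN (HRNetV2p-W32) | APL | 54.9 | #26
Instance Segmentation | COCO minival | HTC (HRNetV2p-W48) | mask AP | 41.0 | #12
Object Detection | COCO minival | Mask R-CNN (HRNetV2p-W18) | box AP | 39.2 | #58
Object Detection | COCO minival | Mask R-CNN (HRNetV2p-W18) | APS | 23.7 | #29
Object Detection | COCO minival | Mask R-CNN (HRNetV2p-W18) | APM | 41.7 | #37
Object Detection | COCO minival | Mask R-CNN (HRNetV2p-W18) | APL | 51.0 | #38
Object Detection | COCO minival | Faster R-CNN (HRNetV2p-W48) | box AP | 41.8 | #40
Object Detection | COCO minival | Faster R-CNN (HRNetV2p-W48) | AP50 | 62.8 | #21
Object Detection | COCO minival | Faster R-CNN (HRNetV2p-W48) | AP75 | 45.9 | #26
Object Detection | COCO minival | Faster R-CNN (HRNetV2p-W48) | APS | 25.0 | #24
Object Detection | COCO minival | Faster R-CNN (HRNetV2p-W48) | APM | 44.7 | #24
Object Detection | COCO minival | Faster R-CNN (HRNetV2p-W48) | APL | 54.6 | #27
Object Detection | COCO minival | Cascade R-CNN (HRNetV2p-W18) | box AP | 41.3 | #44
Object Detection | COCO minival | Cascade R-CNN (HRNetV2p-W18) | AP50 | 59.2 | #40
Object Detection | COCO minival | Cascade R-CNN (HRNetV2p-W18) | AP75 | 44.9 | #30
Object Detection | COCO minival | Cascade R-CNN (HRNetV2p-W18) | APS | 23.7 | #29
Object Detection | COCO minival | Cascade R-CNN (HRNetV2p-W18) | APM | 44.2 | #28
Object Detection | COCO minival | Cascade R-CNN (HRNetV2p-W18) | APL | 54.1 | #28
Object Detection | COCO test-dev | HTC (HRNetV2p-W48) | box AP | 47.3 | #32
Object Detection | COCO test-dev | HTC (HRNetV2p-W48) | AP50 | 65.9 | #36
Object Detection | COCO test-dev | HTC (HRNetV2p-W48) | AP75 | 51.2 | #38
Object Detection | COCO test-dev | HTC (HRNetV2p-W48) | APS | 28.0 | #41
Object Detection | COCO test-dev | HTC (HRNetV2p-W48) | APM | 49.7 | #40
Object Detection | COCO test-dev | HTC (HRNetV2p-W48) | APL | 59.8 | #33
Object Detection | COCO test-dev | Mask R-CNN (HRNetV2p-W32 + cascade) | box AP | 44.7 | #49
Object Detection | COCO test-dev | Mask R-CNN (HRNetV2p-W32 + cascade) | AP50 | 62.5 | #59
Object Detection | COCO test-dev | Mask R-CNN (HRNetV2p-W32 + cascade) | AP75 | 48.6 | #52
Object Detection | COCO test-dev | Mask R-CNN (HRNetV2p-W32 + cascade) | APS | 25.8 | #55
Object Detection | COCO test-dev | Mask R-CNN (HRNetV2p-W32 + cascade) | APM | 47.1 | #56
Object Detection | COCO test-dev | Mask R-CNN (HRNetV2p-W32 + cascade) | APL | 56.3 | #51
Object Detection | COCO test-dev | Cascade R-CNN (HRNetV2p-W48) | box AP | 44.8 | #48
Object Detection | COCO test-dev | Cascade R-CNN (HRNetV2p-W48) | AP50 | 63.1 | #55
Object Detection | COCO test-dev | Cascade R-CNN (HRNetV2p-W48) | AP75 | 48.6 | #52
Object Detection | COCO test-dev | Cascade R-CNN (HRNetV2p-W48) | APS | 26.0 | #53
Object Detection | COCO test-dev | Cascade R-CNN (HRNetV2p-W48) | APM | 47.3 | #54
Object Detection | COCO test-dev | Cascade R-CNN (HRNetV2p-W48) | APL | 56.3 | #51
Object Detection | COCO test-dev | CenterNet (HRNetV2-W48) | box AP | 43.5 | #55
Object Detection | COCO test-dev | CenterNet (HRNetV2-W48) | AP50 | 62.1 | #62
Object Detection | COCO test-dev | CenterNet (HRNetV2-W48) | AP75 | 46.5 | #64
Object Detection | COCO test-dev | CenterNet (HRNetV2-W48) | APS | 22.2 | #75
Object Detection | COCO test-dev | CenterNet (HRNetV2-W48) | APM | 46.5 | #61
Object Detection | COCO test-dev | CenterNet (HRNetV2-W48) | APL | 57.8 | #41
Object Detection | COCO test-dev | FCOS (HRNetV2p-W48) | box AP | 40.5 | #73
Object Detection | COCO test-dev | FCOS (HRNetV2p-W48) | AP50 | 59.3 | #75
Object Detection | COCO test-dev | FCOS (HRNetV2p-W48) | AP75 | 43.3 | #83
Object Detection | COCO test-dev | FCOS (HRNetV2p-W48) | APS | 23.4 | #69
Object Detection | COCO test-dev | FCOS (HRNetV2p-W48) | APM | 42.6 | #78
Object Detection | COCO test-dev | FCOS (HRNetV2p-W48) | APL | 51.0 | #77
Object Detection | COCO test-dev | Mask R-CNN (HRNetV2p-W48 + cascade) | box AP | 46.1 | #40
Object Detection | COCO test-dev | Mask R-CNN (HRNetV2p-W48 + cascade) | AP50 | 64.0 | #49
Object Detection | COCO test-dev | Mask R-CNN (HRNetV2p-W48 + cascade) | AP75 | 50.3 | #45
Object Detection | COCO test-dev | Mask R-CNN (HRNetV2p-W48 + cascade) | APS | 27.1 | #45
Object Detection | COCO test-dev | Mask R-CNN (HRNetV2p-W48 + cascade) | APM | 48.6 | #46
Object Detection | COCO test-dev | Mask R-CNN (HRNetV2p-W48 + cascade) | APL | 58.3 | #38
Object Detection | COCO test-dev | Faster R-CNN (HRNetV2p-W48) | box AP | 42.4 | #62
Object Detection | COCO test-dev | Faster R-CNN (HRNetV2p-W48) | AP50 | 63.6 | #52
Object Detection | COCO test-dev | Faster R-CNN (HRNetV2p-W48) | AP75 | 46.4 | #65
Object Detection | COCO test-dev | Faster R-CNN (HRNetV2p-W48) | APS | 24.9 | #60
Object Detection | COCO test-dev | Faster R-CNN (HRNetV2p-W48) | APM | 44.6 | #70
Object Detection | COCO test-dev | Faster R-CNN (HRNetV2p-W48) | APL | 53.0 | #67
Semantic Segmentation | PASCAL Context | CFNet (ResNet-101) | mIoU | 54.0 | #10
Semantic Segmentation | PASCAL Context | HRNetV2 (HRNetV2-W48) | mIoU | 54.0 | #10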

Methods used in the Paper


Method | Type
Average Pooling | Pooling Operations
Center Pooling | Pooling Operations
Cascade Corner Pooling | Pooling Operations
Cascade R-CNN | Object Detection Models
CenterNet | Object Detection Models
RPN | Region Proposal
RoIPool | RoI Feature Extractors
Faster R-CNN | Object Detection Models
Softmax | Output Functions
RoIAlign | RoI Feature Extractors
Mask R-CNN | Instance Segmentation Models
HRNet | Convolutional Neural Networks
Residual Connection | Skip Connections
ReLU | Activation Functions
1x1 Convolution | Convolutions
Batch Normalization | Normalization
Bottleneck Residual Block | Skip Connection Blocks
Global Average Pooling | Pooling Operations
Residual Block | Skip Connection Blocks
Kaiming Initialization | Initialization
Max Pooling | Pooling Operations
ResNet | Convolutional Neural Networks
Convolution | Convolutions