HRNet, or High-Resolution Net, is a general purpose convolutional neural network for tasks like semantic segmentation, object detection and image classification. It is able to maintain high resolution representations through the whole process. We start from a high-resolution convolution stream, gradually add high-to-low resolution convolution streams one by one, and connect the multi-resolution streams in parallel. The resulting network consists of several ($4$ in the paper) stages and the $n$th stage contains $n$ streams corresponding to $n$ resolutions. The authors conduct repeated multi-resolution fusions by exchanging the information across the parallel streams over and over.
Source: Deep High-Resolution Representation Learning for Visual RecognitionPaper | Code | Results | Date | Stars |
---|
Task | Papers | Share |
---|---|---|
Pose Estimation | 19 | 16.52% |
Semantic Segmentation | 18 | 15.65% |
Image Classification | 5 | 4.35% |
2D Human Pose Estimation | 5 | 4.35% |
Multi-Person Pose Estimation | 4 | 3.48% |
Depth Estimation | 3 | 2.61% |
Autonomous Driving | 2 | 1.74% |
Monocular Depth Estimation | 2 | 1.74% |
3D Human Pose Estimation | 2 | 1.74% |
Component | Type |
|
---|---|---|
![]() |
Normalization | |
![]() |
Convolutions | |
![]() |
Activation Functions | |
![]() |
Skip Connections |