RTMDet: An Empirical Study of Designing Real-Time Object Detectors

14 Dec 2022 · Chengqi Lyu, Wenwei Zhang, Haian Huang, Yue Zhou, Yudong Wang, Yanyi Liu, Shilong Zhang, Kai Chen

In this paper, we aim to design an efficient real-time object detector that exceeds the YOLO series and is easily extensible for many object recognition tasks such as instance segmentation and rotated object detection. To obtain a more efficient model architecture, we explore an architecture that has compatible capacities in the backbone and neck, constructed by a basic building block that consists of large-kernel depth-wise convolutions. We further introduce soft labels when calculating matching costs in the dynamic label assignment to improve accuracy. Together with better training techniques, the resulting object detector, named RTMDet, achieves 52.8% AP on COCO with 300+ FPS on an NVIDIA 3090 GPU, outperforming the current mainstream industrial detectors. RTMDet achieves the best parameter-accuracy trade-off with tiny/small/medium/large/extra-large model sizes for various application scenarios, and obtains new state-of-the-art performance on real-time instance segmentation and rotated object detection. We hope the experimental results can provide new insights into designing versatile real-time object detectors for many object recognition tasks. Code and models are released at https://github.com/open-mmlab/mmdetection/tree/3.x/configs/rtmdet.
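
The abstract names large-kernel depth-wise convolutions as the basic building block but gives no layer details. As a rough illustration of why such a block can stay cheap, the sketch below counts weights for a standard convolution versus a depth-wise convolution followed by a 1x1 point-wise convolution. The channel width (256) and kernel sizes (3 and 5) are assumptions chosen for illustration, not values taken from the paper, and the functions are hypothetical helpers rather than part of RTMDet or MMDetection.

```python
def conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weight count of a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weight count of a k x k depth-wise conv plus a 1 x 1 point-wise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


if __name__ == "__main__":
    channels = 256  # assumed width, for illustration only
    for k in (3, 5):  # assumed kernel sizes, for illustration only
        standard = conv_params(channels, channels, k)
        separable = depthwise_separable_params(channels, channels, k)
        print(f"k={k}: standard={standard}, depth-wise + point-wise={separable}")
```

Under these assumptions, enlarging the kernel from 3 to 5 nearly triples the weights of the standard convolution, while the depth-wise variant grows by only a few thousand weights, because the kernel size affects only the per-channel filters.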


Results from the Paper


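Task                                             Dataset   Model                      Metric       Value          Global Rank
Object Detection In Aerial Images                DOTA      RTMDet-R-l                 mAP          81.33%         # 7
Object Detection In Aerial Images                DOTA      RTMDet-R-l (single scale)  mAP          80.16%         # 17
Oriented Object Detection                        DOTA 1.0  RTMDet-R-l                 mAP          81.33          # 5
Oriented Object Detection                        DOTA 1.5  RTMDet-R-l                 mAP          78.12          # 1
One-stage Anchor-free Oriented Object Detection  HRSC2016  RTMDet-R-tiny              mAP          90.60          # 1
Object Detection In Aerial Images                HRSC2016  RTMDet-R-tiny              mAP-07       90.6           # 3
Object Detection In Aerial Images                HRSC2016  RTMDet-R-tiny              mAP-12       97.10          # 3
Real-time Instance Segmentation                  MSCOCO    RTMDet-Ins-m               Frame (fps)  371 (RTX3090)  # 18
Real-time Instance Segmentation                  MSCOCO    RTMDet-Ins-m               mask AP      42.1           # 3
Real-time Instance Segmentation                  MSCOCO    RTMDet-Ins-m               AP50         63.9           # 3
Real-time Instance Segmentation                  MSCOCO    RTMDet-Ins-m               AP75         45.1           # 3
Real-time Instance Segmentation                  MSCOCO    RTMDet-Ins-m               APS          19.3           # 3
Real-time Instance Segmentation                  MSCOCO    RTMDet-Ins-m               APM          46.4           # 3
Real-time Instance Segmentation                  MSCOCO    RTMDet-Ins-m               APL          63.1           # 3
Real-time Instance Segmentation                  MSCOCO    RTMDet-Ins-x               Frame (fps)  188 (RTX3090)  # 18
Real-time Instance Segmentation                  MSCOCO    RTMDet-Ins-x               mask AP      44.6           # 1
Real-time Instance Segmentation                  MSCOCO    RTMDet-Ins-x               AP50         67.4           # 1
Real-time Instance Segmentation                  MSCOCO    RTMDet-Ins-x               AP75         47.8           # 1
Real-time Instance Segmentation                  MSCOCO    RTMDet-Ins-x               APS          22.2           # 1
Real-time Instance Segmentation                  MSCOCO    RTMDet-Ins-x               APM          49.0           # 1
Real-time Instance Segmentation                  MSCOCO    RTMDet-Ins-x               APL          65.5           # 1
Real-time Instance Segmentation                  MSCOCO    RTMDet-Ins-s               Frame (fps)  518 (RTX3090)  # 18
Real-time Instance Segmentation                  MSCOCO    RTMDet-Ins-s               mask AP      38.7           # 7
Real-time Instance Segmentation                  MSCOCO    RTMDet-Ins-s               AP50         59.3           # 5
Real-time Instance Segmentation                  MSCOCO    RTMDet-Ins-s               AP75         41.3           # 4
Real-time Instance Segmentation                  MSCOCO    RTMDet-Ins-s               APS          15.1           # 6
Real-time Instance Segmentation                  MSCOCO    RTMDet-Ins-s               APM          42.3           # 4
Real-time Instance Segmentation                  MSCOCO    RTMDet-Ins-s               APL          60.3           # 4
Real-time Instance Segmentation                  MSCOCO    RTMDet-Ins-l               Frame (fps)  271 (RTX3090)  # 18
Real-time Instance Segmentation                  MSCOCO    RTMDet-Ins-l               mask AP      43.7           # 2
Real-time Instance Segmentation                  MSCOCO    RTMDet-Ins-l               AP50         66             # 2
Real-time Instance Segmentation                  MSCOCO    RTMDet-Ins-l               AP75         47.0           # 2
Real-time Instance Segmentation                  MSCOCO    RTMDet-Ins-l               APS          20.8           # 2
Real-time Instance Segmentation                  MSCOCO    RTMDet-Ins-l               APM          48.0           # 2
Real-time Instance Segmentation                  MSCOCO    RTMDet-Ins-l               APL          64.8           # 2
Real-Time Object Detection                       MS COCO   YOLOv7                     box AP       52.8           # 24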

Methods