DAMO-YOLO: A Report on Real-Time Object Detection Design

23 Nov 2022  ·  Xianzhe Xu, Yiqi Jiang, Weihua Chen, Yilun Huang, Yuan Zhang, Xiuyu Sun

In this report, we present a fast and accurate object detection method dubbed DAMO-YOLO, which achieves higher performance than the state-of-the-art YOLO series. DAMO-YOLO extends YOLO with several new technologies, including Neural Architecture Search (NAS), an efficient Reparameterized Generalized-FPN (RepGFPN), a lightweight head with AlignedOTA label assignment, and distillation enhancement. In particular, we use MAE-NAS, a method guided by the principle of maximum entropy, to search our detection backbone under the constraints of low latency and high performance, producing ResNet-like / CSP-like structures with spatial pyramid pooling and focus modules. In the design of necks and heads, we follow the rule of "large neck, small head". We import Generalized-FPN with accelerated queen-fusion to build the detector neck and upgrade its CSPNet with efficient layer aggregation networks (ELAN) and reparameterization. We then investigate how detector head size affects detection performance and find that a heavy neck with only one task projection layer yields better results. In addition, AlignedOTA is proposed to solve the misalignment problem in label assignment, and a distillation schema is introduced to further improve performance. Based on these new technologies, we build a suite of models at various scales to meet the needs of different scenarios, i.e., DAMO-YOLO-Tiny/Small/Medium. They achieve 43.0/46.8/50.0 mAP on COCO with latencies of 2.78/3.83/5.62 ms on T4 GPUs, respectively. The code is available at https://github.com/tinyvision/damo-yolo.
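
To compare the per-image latencies quoted in the abstract with frames-per-second figures, the conversion is simple arithmetic. The sketch below is only an illustration: the helper name `latency_ms_to_fps` and the two dictionaries are hypothetical, and it assumes batch size 1 with images processed back to back, so that throughput is the reciprocal of latency. Because the abstract's latencies are for T4 GPUs, the derived values are not expected to match FPS numbers measured on other hardware.

```python
# Values copied from the abstract: latency in ms per image on T4 GPUs, and COCO mAP.
LATENCY_MS = {
    "DAMO-YOLO-Tiny": 2.78,
    "DAMO-YOLO-Small": 3.83,
    "DAMO-YOLO-Medium": 5.62,
}
COCO_MAP = {
    "DAMO-YOLO-Tiny": 43.0,
    "DAMO-YOLO-Small": 46.8,
    "DAMO-YOLO-Medium": 50.0,
}


def latency_ms_to_fps(latency_ms: float) -> float:
    """Convert per-image latency (ms) to frames per second.

    Assumption: batch size 1 and back-to-back inference, so FPS = 1000 / latency.
    """
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    return 1000.0 / latency_ms


def summarize() -> list[tuple[str, float, float, float]]:
    """Return (model, mAP, latency_ms, derived_fps) rows, fastest model first."""
    rows = [
        (name, COCO_MAP[name], ms, latency_ms_to_fps(ms))
        for name, ms in LATENCY_MS.items()
    ]
    return sorted(rows, key=lambda row: row[2])


if __name__ == "__main__":
    for name, m_ap, ms, fps in summarize():
        print(f"{name:<18} mAP={m_ap:.1f}  latency={ms:.2f} ms  derived FPS={fps:.1f}")
```

Read this way, moving from Tiny to Medium gains 7.0 mAP while roughly doubling per-image latency.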

Results from the Paper

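Task                        Dataset        Model        Metric Name      Metric Value  Global Rank
Real-Time Object Detection  COCO           DAMO-YOLO-M  FPS (V100, b=1)  233           #4
Real-Time Object Detection  COCO           DAMO-YOLO-M  box AP           50.0          #12
Real-Time Object Detection  COCO           DAMO-YOLO-M  FPS              233           #4
Real-Time Object Detection  COCO           DAMO-YOLO-T  FPS (V100, b=1)  397           #1
Real-Time Object Detection  COCO           DAMO-YOLO-T  box AP           43.0          #15
Real-Time Object Detection  COCO           DAMO-YOLO-T  FPS              397           #1
Real-Time Object Detection  COCO           DAMO-YOLO-S  FPS (V100, b=1)  325           #2
Real-Time Object Detection  COCO           DAMO-YOLO-S  box AP           46.8          #13
Real-Time Object Detection  COCO           DAMO-YOLO-S  FPS              325           #2
Object Detection            COCO test-dev  DAMO-YOLO-M  box AP           50.0          #86
Object Detection            COCO test-dev  DAMO-YOLO-M  AP50             66.8          #71
Object Detection            COCO test-dev  DAMO-YOLO-M  AP75             54.6          #48
Object Detection            COCO test-dev  DAMO-YOLO-M  APS              30.4          #49
Object Detection            COCO test-dev  DAMO-YOLO-M  APM              54.8          #31
Object Detection            COCO test-dev  DAMO-YOLO-M  APL              67.6          #14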