Cascade R-CNN: Delving into High Quality Object Detection

CVPR 2018  ·  Zhaowei Cai, Nuno Vasconcelos

In object detection, an intersection over union (IoU) threshold is required to define positives and negatives. An object detector trained with a low IoU threshold, e.g. 0.5, usually produces noisy detections. However, detection performance tends to degrade as the IoU threshold is increased. Two main factors are responsible for this: 1) overfitting during training, due to exponentially vanishing positive samples, and 2) inference-time mismatch between the IoUs for which the detector is optimal and those of the input hypotheses. A multi-stage object detection architecture, the Cascade R-CNN, is proposed to address these problems. It consists of a sequence of detectors trained with increasing IoU thresholds, so that each is more selective than the last against close false positives. The detectors are trained stage by stage, leveraging the observation that the output of a detector is a good distribution for training the next, higher-quality detector. The resampling of progressively improved hypotheses guarantees that all detectors have a positive set of examples of equivalent size, reducing the overfitting problem. The same cascade procedure is applied at inference, enabling a closer match between the hypotheses and the detector quality of each stage. A simple implementation of the Cascade R-CNN is shown to surpass all single-model object detectors on the challenging COCO dataset. Experiments also show that the Cascade R-CNN is widely applicable across detector architectures, achieving consistent gains independently of the baseline detector strength. The code will be made available at https://github.com/zhaoweicai/cascade-rcnn.
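
The positive-sample argument in the abstract can be illustrated with a small Python sketch. It is not the authors' implementation. The thresholds 0.5, 0.6 and 0.7, the toy boxes, and the `toy_refine` function are all assumptions made for illustration; `toy_refine` is a hypothetical stand-in for a learned box regressor that simply moves a box halfway toward the ground truth. The sketch compares how many proposals count as positives when a fixed proposal set is labelled at rising thresholds with how many remain when each stage receives the previous stage's refined boxes.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def count_positives(boxes: Sequence[Box], gt: Box, thr: float) -> int:
    """Number of boxes labelled positive at IoU threshold `thr`."""
    return sum(1 for b in boxes if iou(b, gt) >= thr)


def toy_refine(box: Box, gt: Box, step: float = 0.5) -> Box:
    """Hypothetical regressor stand-in: move each coordinate a fraction
    `step` of the way toward the ground truth. A real stage would predict
    this offset from image features, without access to the ground truth."""
    return tuple(b + step * (g - b) for b, g in zip(box, gt))


def fixed_set_counts(boxes: Sequence[Box], gt: Box,
                     thresholds: Sequence[float]) -> List[int]:
    """Positives per threshold when the same proposals are reused."""
    return [count_positives(boxes, gt, t) for t in thresholds]


def cascade_counts(boxes: Sequence[Box], gt: Box,
                   thresholds: Sequence[float]) -> List[int]:
    """Positives per stage when each stage labels the boxes refined by
    the previous stage (the resampling idea described in the abstract)."""
    counts = []
    current = list(boxes)
    for t in thresholds:
        counts.append(count_positives(current, gt, t))
        current = [toy_refine(b, gt) for b in current]
    return counts


# Toy data: one 10x10 ground-truth box and proposals shifted right by d.
gt_box: Box = (0.0, 0.0, 10.0, 10.0)
proposals = [(d, 0.0, 10.0 + d, 10.0) for d in (1.0, 2.0, 3.0, 4.0, 6.0)]
stage_thresholds = (0.5, 0.6, 0.7)  # illustrative values

print(fixed_set_counts(proposals, gt_box, stage_thresholds))  # [3, 2, 1]
print(cascade_counts(proposals, gt_box, stage_thresholds))    # [3, 4, 5]
```

With these toy boxes, the fixed proposal set yields 3, 2 and 1 positives as the threshold rises, while the cascade yields 3, 4 and 5 because each stage labels boxes that have already been refined. A real regressor is learned and imperfect, so the effect in practice is less clean than in this toy.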


Results from the Paper


Each cell gives the metric value, with the global rank in parentheses.

Task: Object Detection  ·  Dataset: AI-TOD

| Model | AP | AP50 | AP75 | APvt | APt | APs | APm |
|---|---|---|---|---|---|---|---|
| Cascade R-CNN (ResNet-50-FPN) | 13.8 (#4) | 30.8 (#4) | 10.5 (#3) | 0.0 (#3) | 10.6 (#4) | 25.5 (#3) | 26.6 (#3) |

Task: Object Detection  ·  Dataset: COCO minival

| Model | box AP | AP50 | AP75 | APS | APM | APL |
|---|---|---|---|---|---|---|
| Cascade R-CNN (ResNet-50-FPN+) | 40.3 (#164) | 59.4 (#82) | 43.7 (#73) | 22.9 (#63) | 43.7 (#61) | 54.1 (#60) |
| Cascade R-CNN (ResNet-101-FPN+, cascade) | 42.7 (#136) | 61.6 (#65) | 46.6 (#50) | 23.8 (#57) | 46.2 (#42) | 57.4 (#47) |

Task: Object Detection  ·  Dataset: COCO test-dev

| Model | box mAP | AP50 | AP75 | APS | APM | APL | Hardware Burden | Operations per network pass |
|---|---|---|---|---|---|---|---|---|
| Cascade R-CNN (ResNet-101-FPN+, cascade) | 42.8 (#151) | 62.1 (#106) | 46.3 (#104) | 23.7 (#102) | 45.5 (#98) | 55.2 (#92) | None (#1) | None (#1) |
| Cascade R-CNN (ResNet-50-FPN+, cascade) | 40.6 (#172) | 59.9 (#124) | 44.0 (#124) | 22.6 (#113) | 42.7 (#119) | 52.1 (#116) | 12G (#1) | None (#1) |
| Cascade R-CNN (ResNet-50-FPN+) | 36.5 (#204) | 59.0 (#133) | 39.2 (#143) | 20.3 (#129) | 38.8 (#134) | 46.4 (#136) | 3G (#1) | None (#1) |
| Cascade R-CNN (ResNet-101-FPN+) | 38.8 (#192) | 61.1 (#113) | 41.9 (#134) | 21.3 (#125) | 41.8 (#126) | 49.8 (#133) | 3G (#1) | None (#1) |

Task: 2D Object Detection  ·  Dataset: SARDet-100K

| Model | box mAP |
|---|---|
| Cascade R-CNN | 51.1 (#4) |

Methods