Libra R-CNN: Towards Balanced Learning for Object Detection

Compared with model architectures, the training process, which is also crucial to the success of detectors, has received relatively little attention in object detection. In this work, we carefully revisit the standard training practice of detectors, and find that the detection performance is often limited by imbalance during the training process, which generally occurs at three levels: sample level, feature level, and objective level. To mitigate the resulting adverse effects, we propose Libra R-CNN, a simple but effective framework towards balanced learning for object detection. It integrates three novel components: IoU-balanced sampling, balanced feature pyramid, and balanced L1 loss, which reduce the imbalance at the sample, feature, and objective level respectively. Benefiting from the overall balanced design, Libra R-CNN significantly improves the detection performance. Without bells and whistles, it achieves 2.5 points and 2.0 points higher Average Precision (AP) than FPN Faster R-CNN and RetinaNet respectively on MS COCO.
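The abstract names balanced L1 loss without stating its form. As a rough illustration, the sketch below follows the piecewise definition commonly given for this loss on a single regression residual, with the inlier/outlier boundary at |x| = 1. The function names and the defaults alpha = 0.5 and gamma = 1.5 are assumptions made for this example and are not taken from the text above; it is not the authors' implementation.

```python
import math


def balanced_l1(x, alpha=0.5, gamma=1.5):
    """Balanced L1 loss for one residual x (sketch; defaults are assumed)."""
    # b is fixed by the constraint alpha * ln(b + 1) = gamma, which keeps
    # the gradient continuous at the boundary |x| = 1.
    b = math.exp(gamma / alpha) - 1.0
    d = abs(x)
    if d < 1.0:
        # Inlier branch: its gradient magnitude is alpha * ln(b * d + 1).
        return alpha / b * (b * d + 1.0) * math.log(b * d + 1.0) - alpha * d
    # Outlier branch: linear with slope gamma; the constant term makes the
    # loss value itself continuous at |x| = 1.
    return gamma * d + gamma / b - alpha


def balanced_l1_grad(x, alpha=0.5, gamma=1.5):
    """Derivative of balanced_l1 with respect to x."""
    if x == 0:
        return 0.0
    b = math.exp(gamma / alpha) - 1.0
    d = abs(x)
    magnitude = alpha * math.log(b * d + 1.0) if d < 1.0 else gamma
    return math.copysign(magnitude, x)


def smooth_l1_grad(x):
    """Gradient of the standard smooth L1 loss, for comparison."""
    return x if abs(x) < 1.0 else math.copysign(1.0, x)


if __name__ == "__main__":
    for r in (0.05, 0.1, 0.5, 0.9, 2.0):
        print(r, round(balanced_l1_grad(r), 4), round(smooth_l1_grad(r), 4))
```

Under these assumed settings, a small residual receives a larger gradient than it would under smooth L1, while the gradient of a large residual is capped at gamma. This is one way to read "balance at the objective level": accurate samples contribute more to the regression gradient, and outliers cannot dominate it.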

CVPR 2019


Results from the Paper

Task              Dataset        Model                          Metric  Value  Global Rank
Object Detection  COCO minival   Libra R-CNN (ResNet-50 FPN)    box AP  38.5   #149
                                                                AP50    59.3   #77
                                                                AP75    42.0   #75
                                                                APS     22.9   #60
                                                                APM     42.1   #66
                                                                APL     50.5   #67
Object Detection  COCO test-dev  Libra R-CNN (ResNeXt-101-FPN)  box AP  43.0   #132
                                                                AP50    64     #92
                                                                AP75    47     #97
                                                                APS     25.3   #96
                                                                APM     45.6   #106
                                                                APL     54.6   #108