A novel Region of Interest Extraction Layer for Instance Segmentation

28 Apr 2020 · Leonardo Rossi, Akbar Karimi, Andrea Prati

Given the wide diffusion of deep neural network architectures for computer vision tasks, several new applications are becoming increasingly feasible. Among them, particular attention has recently been given to instance segmentation, exploiting the results achievable by two-stage networks (such as Mask R-CNN or Faster R-CNN) derived from R-CNN. In these complex architectures, a crucial role is played by the Region of Interest (RoI) extraction layer, which extracts a coherent subset of features from a single Feature Pyramid Network (FPN) layer attached on top of a backbone. This paper is motivated by the need to overcome the limitations of existing RoI extractors, which select only one (the best) layer from the FPN. Our intuition is that all the layers of the FPN retain useful information. Therefore, the proposed layer (called Generic RoI Extractor - GRoIE) introduces non-local building blocks and attention mechanisms to boost performance. A comprehensive component-level ablation study is conducted to find the best set of algorithms and parameters for the GRoIE layer. Moreover, GRoIE can be integrated seamlessly with every two-stage architecture for both object detection and instance segmentation tasks. Therefore, the improvements brought about by GRoIE in different state-of-the-art architectures are also evaluated. The proposed layer yields up to a 1.1% AP improvement on bounding box detection and a 1.7% AP improvement on instance segmentation. The code is publicly available on GitHub at https://github.com/IMPLabUniPr/mmdetection/tree/groie_dev
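To make the contrast in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes the single-layer selection follows the level-mapping heuristic from the original FPN paper, k = floor(k0 + log2(sqrt(w*h)/224)), and it uses a plain element-wise sum as a stand-in for aggregating all layers; the non-local and attention components of GRoIE are not modelled. The function names and the toy feature vectors are hypothetical.

```python
import math


def select_fpn_level(w, h, k0=4, k_min=2, k_max=5, canonical=224):
    """Pick one pyramid level for a w x h RoI (FPN-paper heuristic, assumed here)."""
    k = math.floor(k0 + math.log2(math.sqrt(w * h) / canonical))
    return max(k_min, min(k_max, k))  # clamp to the available levels


def single_level_extract(pooled_by_level, w, h):
    """Baseline behaviour: keep only the features pooled from the selected level."""
    return pooled_by_level[select_fpn_level(w, h)]


def aggregate_all_levels(pooled_by_level):
    """Illustrative alternative: element-wise sum over features from every level.

    The sum is an assumption made for this sketch; it only shows that
    information from all levels contributes to the output.
    """
    return [sum(values) for values in zip(*pooled_by_level.values())]


# Toy example: the same RoI already pooled to a 2-element vector per level.
pooled = {2: [1.0, 2.0], 3: [0.5, 0.5], 4: [0.0, 1.0], 5: [2.0, 0.0]}
baseline = single_level_extract(pooled, 224, 224)  # uses level 4 only
combined = aggregate_all_levels(pooled)            # uses levels 2 to 5
```

In the baseline, the features from levels 2, 3, and 5 are discarded for this RoI, which is the limitation the abstract refers to.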



Task                   Dataset       Model                               Metric   Value  Global Rank
Instance Segmentation  COCO minival  GCnet (ResNet-50-FPN, GRoIE)        mask AP  37.2   #82
                                                                         AP50     59.3   #14
                                                                         AP75     39.8   #16
                                                                         APL      51.2   #11
                                                                         APM      41     #9
                                                                         APS      20.2   #9
Object Detection       COCO minival  Mask R-CNN (ResNet-50-FPN, GRoIE)   box AP   38.4   #183
                                                                         AP50     59.9   #78
                                                                         AP75     41.7   #81
                                                                         APS      22.9   #63
                                                                         APM      42.1   #69
                                                                         APL      49.7   #77
Instance Segmentation  COCO minival  Mask R-CNN (ResNet-50-FPN, GRoIE)   mask AP  35.8   #86
                                                                         AP50     57.1   #18
                                                                         AP75     38.0   #17
                                                                         APL      48.7   #12
                                                                         APM      39     #12
                                                                         APS      19.1   #11
Object Detection       COCO minival  Faster R-CNN (ResNet-50-FPN, GRoIE) box AP   37.5   #190
                                                                         AP50     59.2   #85
                                                                         AP75     40.6   #88
                                                                         APS      22.3   #69
                                                                         APM      41.5   #72
                                                                         APL      47.8   #82