PISA

Last updated on Feb 23, 2021

Faster R-CNN PISA (R-50-FPN, 1x)

LR schedule: 1x
Backbone layers: 50
File size: 159.54 MB
Training data: MS COCO
Training resources: 8x NVIDIA V100 GPUs
Architecture: Softmax, RPN, Convolution, FPN, RoIPool, PISA, ResNet

Faster R-CNN PISA (X101-32x4d-FPN, 1x)

LR schedule: 1x
File size: 230.94 MB
Training data: MS COCO
Training resources: 8x NVIDIA V100 GPUs
Architecture: Softmax, RPN, ResNeXt, Convolution, FPN, RoIPool, PISA

Mask R-CNN PISA (R-50-FPN, 1x)

LR schedule: 1x
Backbone layers: 50
File size: 169.63 MB
Training data: MS COCO
Training resources: 8x NVIDIA V100 GPUs
Architecture: Softmax, RPN, Convolution, Dense Connections, FPN, PISA, ResNet, RoIAlign

Mask R-CNN PISA (X101-32x4d-FPN, 1x)

This model lacks metadata!


RetinaNet PISA (R-50-FPN, 1x)

LR schedule: 1x
Backbone layers: 50
File size: 145.10 MB
Training data: MS COCO
Training resources: 8x NVIDIA V100 GPUs
Architecture: PISA, ResNet, FPN, Focal Loss

RetinaNet PISA (X101-32x4d-FPN, 1x)

LR schedule: 1x
File size: 216.51 MB
Training data: MS COCO
Training resources: 8x NVIDIA V100 GPUs
Architecture: PISA, ResNeXt, FPN, Focal Loss

SSD300 PISA (VGG16, 1x)

LR schedule: 1x
File size: 130.88 MB
Training data: MS COCO
Training resources: 8x NVIDIA V100 GPUs
Architecture: PISA, SSD, Non-Maximum Suppression

README.md

Prime Sample Attention in Object Detection

Introduction


@inproceedings{cao2019prime,
  title={Prime sample attention in object detection},
  author={Cao, Yuhang and Chen, Kai and Loy, Chen Change and Lin, Dahua},
  booktitle={IEEE Conference on Computer Vision and Pattern Recognition},
  year={2020}
}

Results and models

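| PISA | Network | Backbone | Lr schd | box AP | mask AP | Config | Download |
|:----:|:-------:|:--------:|:-------:|:------:|:-------:|:------:|:--------:|
| × | Faster R-CNN | R-50-FPN | 1x | 36.4 | | - | |
| ✓ | Faster R-CNN | R-50-FPN | 1x | 38.4 | | config | model \| log |
| × | Faster R-CNN | X101-32x4d-FPN | 1x | 40.1 | | - | |
| ✓ | Faster R-CNN | X101-32x4d-FPN | 1x | 41.9 | | config | model \| log |
| × | Mask R-CNN | R-50-FPN | 1x | 37.3 | 34.2 | - | |
| ✓ | Mask R-CNN | R-50-FPN | 1x | 39.1 | 35.2 | config | model \| log |
| × | Mask R-CNN | X101-32x4d-FPN | 1x | 41.1 | 37.1 | - | |
| ✓ | Mask R-CNN | X101-32x4d-FPN | 1x | | | | |
| × | RetinaNet | R-50-FPN | 1x | 35.6 | | - | |
| ✓ | RetinaNet | R-50-FPN | 1x | 36.9 | | config | model \| log |
| × | RetinaNet | X101-32x4d-FPN | 1x | 39.0 | | - | |
| ✓ | RetinaNet | X101-32x4d-FPN | 1x | 40.7 | | config | model \| log |
| × | SSD300 | VGG16 | 1x | 25.6 | | - | |
| ✓ | SSD300 | VGG16 | 1x | 27.6 | | config | model \| log |
| × | SSD300 | VGG16 | 1x | 29.3 | | - | |
| ✓ | SSD300 | VGG16 | 1x | 31.8 | | config | model \| log |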
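
The box AP improvement from PISA can be read off the table by subtracting each baseline row from the PISA row that follows it. The snippet below is a minimal sketch of that arithmetic. The numbers are copied from the table; the names `ROWS` and `box_ap_gains` are illustrative and not part of mmdet. The Mask R-CNN X101-32x4d-FPN entry is skipped because the table lists no PISA result for it.

```python
# Box AP values copied from the results table: (network, backbone, baseline, with PISA).
# None marks the entry the table leaves blank.
ROWS = [
    ("Faster R-CNN", "R-50-FPN", 36.4, 38.4),
    ("Faster R-CNN", "X101-32x4d-FPN", 40.1, 41.9),
    ("Mask R-CNN", "R-50-FPN", 37.3, 39.1),
    ("Mask R-CNN", "X101-32x4d-FPN", 41.1, None),
    ("RetinaNet", "R-50-FPN", 35.6, 36.9),
    ("RetinaNet", "X101-32x4d-FPN", 39.0, 40.7),
    ("SSD300", "VGG16", 25.6, 27.6),
    ("SSD300", "VGG16", 29.3, 31.8),
]


def box_ap_gains(rows):
    """Return (network, backbone, gain) for rows that have both numbers."""
    return [
        (net, backbone, round(pisa - base, 1))
        for net, backbone, base, pisa in rows
        if base is not None and pisa is not None
    ]


if __name__ == "__main__":
    for net, backbone, gain in box_ap_gains(ROWS):
        print(f"{net:<13} {backbone:<15} +{gain:.1f}")
```

For the rows that report both numbers, the gains fall between 1.3 and 2.5 box AP.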

Notes:

  • In the original paper, all models were trained and tested on mmdet v1.x, so the results may not exactly match those of this release on v2.0.
  • PISA only modifies the training pipeline, so inference time is the same as for the baseline.

Results

Object Detection on COCO minival

| Model | box AP |
|:------|:------:|
| Faster R-CNN PISA (X101-32x4d-FPN, 1x) | 41.9 |
| RetinaNet PISA (X101-32x4d-FPN, 1x) | 40.7 |
| Mask R-CNN PISA (R-50-FPN, 1x) | 39.1 |
| Faster R-CNN PISA (R-50-FPN, 1x) | 38.4 |
| RetinaNet PISA (R-50-FPN, 1x) | 36.9 |
| SSD300 PISA (VGG16, 1x) | 31.8 |