Mask Scoring R-CNN

Last updated on Feb 23, 2021

Mask Scoring R-CNN (R-101-FPN, 1x)

Memory (M) 6500.0
Backbone Layers 101
File Size 304.66 MB
Training Data MS COCO
Training Resources 8x NVIDIA V100 GPUs
Training Time

Architecture Softmax, RPN, Convolution, Max Pooling, Dense Connections, FPN, ResNet, RoIAlign
lr sched 1x
Mask Scoring R-CNN (R-101-FPN, 2x)

lr sched 2x
Backbone Layers 101
File Size 304.66 MB
Training Data MS COCO
Training Resources 8x NVIDIA V100 GPUs
Training Time

Architecture Softmax, RPN, Convolution, Max Pooling, Dense Connections, FPN, ResNet, RoIAlign
Mask Scoring R-CNN (R-50-FPN, 1x)

Memory (M) 4500.0
Backbone Layers 50
File Size 231.96 MB
Training Data MS COCO
Training Resources 8x NVIDIA V100 GPUs
Training Time

Architecture Softmax, RPN, Convolution, Max Pooling, Dense Connections, FPN, ResNet, RoIAlign
lr sched 1x
Mask Scoring R-CNN (R-50-FPN, 2x)

lr sched 2x
Backbone Layers 50
File Size 231.96 MB
Training Data MS COCO
Training Resources 8x NVIDIA V100 GPUs
Training Time

Architecture Softmax, RPN, Convolution, Max Pooling, Dense Connections, FPN, ResNet, RoIAlign
Mask Scoring R-CNN (R-X101-32x4d, 2x)

Memory (M) 7900.0
inference time (s/im) 0.09091
File Size 303.36 MB
Training Data MS COCO
Training Resources 8x NVIDIA V100 GPUs
Training Time

Architecture Softmax, RPN, ResNeXt, Convolution, Max Pooling, Dense Connections, RoIAlign
lr sched 2x
Mask Scoring R-CNN (R-X101-64x4d, 1x)

Memory (M) 11000.0
inference time (s/im) 0.125
File Size 453.44 MB
Training Data MS COCO
Training Resources 8x NVIDIA V100 GPUs
Training Time

Architecture Softmax, RPN, ResNeXt, Convolution, Max Pooling, Dense Connections, RoIAlign
lr sched 1x
Mask Scoring R-CNN (R-X101-64x4d, 2x)

Memory (M) 11000.0
inference time (s/im) 0.125
File Size 453.44 MB
Training Data MS COCO
Training Resources 8x NVIDIA V100 GPUs
Training Time

Architecture Softmax, RPN, ResNeXt, Convolution, Max Pooling, Dense Connections, RoIAlign
lr sched 2x
README.md

Mask Scoring R-CNN

Introduction

@inproceedings{huang2019msrcnn,
    title={Mask Scoring R-CNN},
    author={Zhaojin Huang and Lichao Huang and Yongchao Gong and Chang Huang and Xinggang Wang},
    booktitle={IEEE Conference on Computer Vision and Pattern Recognition},
    year={2019},
}

Results and Models

| Backbone | Style | Lr schd | Mem (GB) | Inf time (fps) | box AP | mask AP | Config | Download |
|---|---|---|---|---|---|---|---|---|
| R-50-FPN | caffe | 1x | 4.5 | | 38.2 | 36.0 | config | model, log |
| R-50-FPN | caffe | 2x | - | - | 38.8 | 36.3 | config | model, log |
| R-101-FPN | caffe | 1x | 6.5 | | 40.4 | 37.6 | config | model, log |
| R-101-FPN | caffe | 2x | - | - | 41.1 | 38.1 | config | model, log |
| R-X101-32x4d | pytorch | 2x | 7.9 | 11.0 | 41.8 | 38.7 | config | model, log |
| R-X101-64x4d | pytorch | 1x | 11.0 | 8.0 | 43.0 | 39.5 | config | model, log |
| R-X101-64x4d | pytorch | 2x | 11.0 | 8.0 | 42.6 | 39.5 | config | model, log |
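
The model cards at the top of this page list memory in M and inference time in s/im, while the table above uses GB and fps. The sketch below shows how the two sets of numbers appear to relate, assuming s/im is simply 1 / fps and that 1 GB corresponds to 1000 M. Both are assumptions inferred from the listed values (for example 11.0 fps against 0.09091 s/im), and the helper names are illustrative only.

```python
# Values copied from the Results and Models table.
# Rows without a reported inference speed or memory figure are omitted.
FPS = {
    "R-X101-32x4d, 2x": 11.0,
    "R-X101-64x4d, 1x": 8.0,
    "R-X101-64x4d, 2x": 8.0,
}
MEM_GB = {
    "R-50-FPN, 1x": 4.5,
    "R-101-FPN, 1x": 6.5,
    "R-X101-32x4d, 2x": 7.9,
    "R-X101-64x4d, 1x": 11.0,
    "R-X101-64x4d, 2x": 11.0,
}


def seconds_per_image(fps: float) -> float:
    """Convert throughput (images per second) to latency (seconds per image)."""
    return 1.0 / fps


def memory_mb(mem_gb: float) -> float:
    """Convert GB to M, assuming the cards use 1 GB = 1000 M."""
    return mem_gb * 1000.0


if __name__ == "__main__":
    for name, fps in FPS.items():
        print(f"{name}: {seconds_per_image(fps):.5f} s/im")
    for name, gb in MEM_GB.items():
        print(f"{name}: {memory_mb(gb):.1f} M")
```

Under these assumptions the converted values agree with the card fields, which suggests the cards and the table report the same measurements in different units.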

Results

Object Detection on COCO minival

| Model | Box AP |
|---|---|
| Mask Scoring R-CNN (R-X101-64x4d, 1x) | 43.0 |
| Mask Scoring R-CNN (R-X101-64x4d, 2x) | 42.6 |
| Mask Scoring R-CNN (R-X101-32x4d, 2x) | 41.8 |
| Mask Scoring R-CNN (R-101-FPN, 2x) | 41.1 |
| Mask Scoring R-CNN (R-101-FPN, 1x) | 40.4 |
| Mask Scoring R-CNN (R-50-FPN, 2x) | 38.8 |
| Mask Scoring R-CNN (R-50-FPN, 1x) | 38.2 |

Instance Segmentation on COCO minival

| Model | Mask AP |
|---|---|
| Mask Scoring R-CNN (R-X101-64x4d, 1x) | 39.5 |
| Mask Scoring R-CNN (R-X101-64x4d, 2x) | 39.5 |
| Mask Scoring R-CNN (R-X101-32x4d, 2x) | 38.7 |
| Mask Scoring R-CNN (R-101-FPN, 2x) | 38.1 |
| Mask Scoring R-CNN (R-101-FPN, 1x) | 37.6 |
| Mask Scoring R-CNN (R-50-FPN, 2x) | 36.3 |
| Mask Scoring R-CNN (R-50-FPN, 1x) | 36.0 |
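
The 1x and 2x entries above can be compared per backbone to see what the longer schedule changes. The following is a minimal sketch using only the AP values listed on this page; the `RESULTS` dictionary and the `schedule_gain` helper are hypothetical names introduced for illustration, not part of any library.

```python
# (backbone, schedule) -> (box AP, mask AP), copied from the tables on this page.
RESULTS = {
    ("R-50-FPN", "1x"): (38.2, 36.0),
    ("R-50-FPN", "2x"): (38.8, 36.3),
    ("R-101-FPN", "1x"): (40.4, 37.6),
    ("R-101-FPN", "2x"): (41.1, 38.1),
    ("R-X101-32x4d", "2x"): (41.8, 38.7),
    ("R-X101-64x4d", "1x"): (43.0, 39.5),
    ("R-X101-64x4d", "2x"): (42.6, 39.5),
}


def schedule_gain(backbone):
    """Return (box AP gain, mask AP gain) of 2x over 1x.

    Returns None when either schedule is not listed for the backbone.
    """
    one = RESULTS.get((backbone, "1x"))
    two = RESULTS.get((backbone, "2x"))
    if one is None or two is None:
        return None
    # Round to one decimal to avoid floating-point noise in the differences.
    return (round(two[0] - one[0], 1), round(two[1] - one[1], 1))


if __name__ == "__main__":
    for backbone in sorted({name for name, _ in RESULTS}):
        print(backbone, schedule_gain(backbone))
```

On the listed numbers, the 2x schedule adds 0.6 box AP and 0.3 mask AP for R-50-FPN and 0.7 and 0.5 for R-101-FPN, while for R-X101-64x4d box AP is 0.4 lower and mask AP is unchanged.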