DCN

Last updated on Feb 23, 2021

Cascade Mask R-CNN DCN (R-101-FPN, 1x, pytorch, dconv(c3-c5))

Memory (M) 8000.0
inference time (s/im) 0.11628
File Size 372.86 MB
Training Data MS COCO
Training Resources 8x NVIDIA V100 GPUs
Training Time

Architecture Softmax, RPN, Convolution, Dense Connections, Deformable Convolution, FPN, ResNet, RoIAlign
lr sched 1x
Backbone Layers 101
Cascade Mask R-CNN DCN (R-50-FPN, 1x, pytorch, dconv(c3-c5))

Memory (M) 6000.0
inference time (s/im) 0.1
File Size 297.46 MB
Training Data MS COCO
Training Resources 8x NVIDIA V100 GPUs
Training Time

Architecture Softmax, RPN, Convolution, Dense Connections, Deformable Convolution, FPN, ResNet, RoIAlign
lr sched 1x
Backbone Layers 50
Cascade Mask R-CNN DCN (X-101-32x4d-FPN, 1x, pytorch, dconv(c3-c5))

Memory (M) 9200.0
Backbone Layers 101
File Size 376.47 MB
Training Data MS COCO
Training Resources 8x NVIDIA V100 GPUs
Training Time

Architecture Softmax, RPN, ResNeXt, Convolution, Dense Connections, Deformable Convolution, FPN, RoIAlign
lr sched 1x
Cascade R-CNN DCN (R-101-FPN, 1x, pytorch, dconv(c3-c5))

Memory (M) 6400.0
inference time (s/im) 0.09091
File Size 342.60 MB
Training Data MS COCO
Training Resources 8x NVIDIA V100 GPUs
Training Time

Architecture RPN, Deformable Convolution, FPN, Cascade R-CNN, ResNet, RoIAlign
lr sched 1x
Backbone Layers 101
Cascade R-CNN DCN (R-50-FPN, 1x, pytorch, dconv(c3-c5))

Memory (M) 4500.0
inference time (s/im) 0.06849
File Size 267.21 MB
Training Data MS COCO
Training Resources 8x NVIDIA V100 GPUs
Training Time

Architecture RPN, Deformable Convolution, FPN, Cascade R-CNN, ResNet, RoIAlign
lr sched 1x
Backbone Layers 50
Faster R-CNN DCN (R-101-FPN, 1x, pytorch, dconv(c3-c5))

Memory (M) 6000.0
inference time (s/im) 0.08
File Size 237.15 MB
Training Data MS COCO
Training Resources 8x NVIDIA V100 GPUs
Training Time

Architecture Softmax, RPN, Convolution, Deformable Convolution, FPN, RoIPool, ResNet
lr sched 1x
Backbone Layers 101
Faster R-CNN DCN (R-50-FPN, 1x, pytorch, dconv(c3-c5))

Memory (M) 4000.0
inference time (s/im) 0.05618
File Size 161.76 MB
Training Data MS COCO
Training Resources 8x NVIDIA V100 GPUs
Training Time

Architecture Softmax, RPN, Convolution, Deformable Convolution, FPN, RoIPool, ResNet
lr sched 1x
Backbone Layers 50
Faster R-CNN DCN (R-50-FPN, 1x, pytorch, dpool)

Memory (M) 5000.0
inference time (s/im) 0.05814
File Size 373.11 MB
Training Data MS COCO
Training Resources 8x NVIDIA V100 GPUs
Training Time

Architecture Softmax, RPN, Convolution, Deformable Convolution, FPN, RoIPool, ResNet
lr sched 1x
Backbone Layers 50
Faster R-CNN DCN (R-50-FPN, 1x, pytorch, mdconv(c3-c5))

Memory (M) 4100.0
inference time (s/im) 0.05682
File Size 162.87 MB
Training Data MS COCO
Training Resources 8x NVIDIA V100 GPUs
Training Time

Architecture Softmax, RPN, Convolution, Deformable Convolution, FPN, RoIPool, ResNet
lr sched 1x
Backbone Layers 50
Faster R-CNN DCN (R-50-FPN, 1x, pytorch, mdpool)

Memory (M) 5800.0
inference time (s/im) 0.06024
File Size 569.89 MB
Training Data MS COCO
Training Resources 8x NVIDIA V100 GPUs
Training Time

Architecture Softmax, RPN, Convolution, Deformable Convolution, FPN, RoIPool, ResNet
lr sched 1x
Backbone Layers 50
Faster R-CNN DCN (*R-50-FPN (dg=4), 1x, pytorch, mdconv(c3-c5))

Memory (M) 4200.0
inference time (s/im) 0.05747
File Size 172.84 MB
Training Data MS COCO
Training Resources 8x NVIDIA V100 GPUs
Training Time

Architecture Softmax, RPN, Convolution, Deformable Convolution, FPN, RoIPool, ResNet
lr sched 1x
Backbone Layers 50
Faster R-CNN DCN (X-101-32x4d-FPN, 1x, pytorch, dconv(c3-c5))

Memory (M) 7300.0
inference time (s/im) 0.1
File Size 240.76 MB
Training Data MS COCO
Training Resources 8x NVIDIA V100 GPUs
Training Time

Architecture Softmax, RPN, ResNeXt, Convolution, Deformable Convolution, FPN, RoIPool
lr sched 1x
Backbone Layers 101
Mask R-CNN DCN (R-101-FPN, 1x, pytorch, dconv(c3-c5))

Memory (M) 6500.0
inference time (s/im) 0.08547
File Size 247.24 MB
Training Data MS COCO
Training Resources 8x NVIDIA V100 GPUs
Training Time

Architecture Softmax, RPN, Convolution, Dense Connections, Deformable Convolution, FPN, ResNet, RoIAlign
lr sched 1x
Backbone Layers 101
Mask R-CNN DCN (R-50-FPN, 1x, pytorch, dconv(c3-c5))

Memory (M) 4500.0
inference time (s/im) 0.06494
File Size 171.84 MB
Training Data MS COCO
Training Resources 8x NVIDIA V100 GPUs
Training Time

Architecture Softmax, RPN, Convolution, Dense Connections, Deformable Convolution, FPN, ResNet, RoIAlign
lr sched 1x
Backbone Layers 50
Mask R-CNN DCN (R-50-FPN, 1x, pytorch, mdconv(c3-c5))

Memory (M) 4500.0
inference time (s/im) 0.06623
File Size 172.95 MB
Training Data MS COCO
Training Resources 8x NVIDIA V100 GPUs
Training Time

Architecture Softmax, RPN, Convolution, Dense Connections, Deformable Convolution, FPN, ResNet, RoIAlign
lr sched 1x
Backbone Layers 50
README.md

Deformable Convolutional Networks

Introduction

[ALGORITHM]

@inproceedings{dai2017deformable,
  title={Deformable Convolutional Networks},
  author={Dai, Jifeng and Qi, Haozhi and Xiong, Yuwen and Li, Yi and Zhang, Guodong and Hu, Han and Wei, Yichen},
  booktitle={Proceedings of the IEEE international conference on computer vision},
  year={2017}
}

[ALGORITHM]

@article{zhu2018deformable,
  title={Deformable ConvNets v2: More Deformable, Better Results},
  author={Zhu, Xizhou and Hu, Han and Lin, Stephen and Dai, Jifeng},
  journal={arXiv preprint arXiv:1811.11168},
  year={2018}
}

Results and Models

| Backbone | Model | Style | Conv | Pool | Lr schd | Mem (GB) | Inf time (fps) | box AP | mask AP | Config | Download |
|---|---|---|---|---|---|---|---|---|---|---|---|
| R-50-FPN | Faster | pytorch | dconv(c3-c5) | - | 1x | 4.0 | 17.8 | 41.3 | | config | model, log |
| R-50-FPN | Faster | pytorch | mdconv(c3-c5) | - | 1x | 4.1 | 17.6 | 41.4 | | config | model, log |
| *R-50-FPN (dg=4) | Faster | pytorch | mdconv(c3-c5) | - | 1x | 4.2 | 17.4 | 41.5 | | config | model, log |
| R-50-FPN | Faster | pytorch | - | dpool | 1x | 5.0 | 17.2 | 38.9 | | config | model, log |
| R-50-FPN | Faster | pytorch | - | mdpool | 1x | 5.8 | 16.6 | 38.7 | | config | model, log |
| R-101-FPN | Faster | pytorch | dconv(c3-c5) | - | 1x | 6.0 | 12.5 | 42.7 | | config | model, log |
| X-101-32x4d-FPN | Faster | pytorch | dconv(c3-c5) | - | 1x | 7.3 | 10.0 | 44.5 | | config | model, log |
| R-50-FPN | Mask | pytorch | dconv(c3-c5) | - | 1x | 4.5 | 15.4 | 41.8 | 37.4 | config | model, log |
| R-50-FPN | Mask | pytorch | mdconv(c3-c5) | - | 1x | 4.5 | 15.1 | 41.5 | 37.1 | config | model, log |
| R-101-FPN | Mask | pytorch | dconv(c3-c5) | - | 1x | 6.5 | 11.7 | 43.5 | 38.9 | config | model, log |
| R-50-FPN | Cascade | pytorch | dconv(c3-c5) | - | 1x | 4.5 | 14.6 | 43.8 | | config | model, log |
| R-101-FPN | Cascade | pytorch | dconv(c3-c5) | - | 1x | 6.4 | 11.0 | 45.0 | | config | model, log |
| R-50-FPN | Cascade Mask | pytorch | dconv(c3-c5) | - | 1x | 6.0 | 10.0 | 44.4 | 38.6 | config | model, log |
| R-101-FPN | Cascade Mask | pytorch | dconv(c3-c5) | - | 1x | 8.0 | 8.6 | 45.8 | 39.7 | config | model, log |
| X-101-32x4d-FPN | Cascade Mask | pytorch | dconv(c3-c5) | - | 1x | 9.2 | | 47.3 | 41.1 | config | model, log |
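
The model cards report inference time in seconds per image and memory in "M", while the table reports frames per second and GB. The sketch below shows how the two appear to relate. It assumes fps is the reciprocal of s/im and that 1 GB corresponds to 1000 M; the function names are illustrative and not part of any library.

```python
def seconds_per_image_to_fps(sec_per_im: float) -> float:
    """Convert per-image latency (s/im) to throughput (images per second)."""
    if sec_per_im <= 0:
        raise ValueError("latency must be positive")
    return 1.0 / sec_per_im


def card_memory_to_gb(mem_m: float) -> float:
    """Convert a card's 'Memory (M)' value to 'Mem (GB)', assuming 1 GB = 1000 M."""
    return mem_m / 1000.0


# Values copied from the Faster R-CNN DCN (R-50-FPN, dconv(c3-c5)) card.
fps = round(seconds_per_image_to_fps(0.05618), 1)
mem_gb = round(card_memory_to_gb(4000.0), 1)
print(fps, mem_gb)  # 17.8 4.0, matching the first table row
```

The same conversion reproduces the other rows, for example 0.11628 s/im gives 8.6 fps for the Cascade Mask R-101-FPN entry.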

Notes:

  • dconv and mdconv denote deformable and modulated deformable convolution, respectively; c3-c5 means the deformable convolution is added in ResNet stages 3 to 5. dpool and mdpool denote deformable and modulated deformable RoI pooling, respectively.
  • The DCN ops are modified from https://github.com/chengdazhi/Deformable-Convolution-V2-PyTorch, which should be more memory efficient and slightly faster.
  • (*) For R-50-FPN (dg=4), dg is short for deformable_group. This model was trained and tested on an Amazon EC2 p3dn.24xlarge instance.
  • The memory and training/inference time figures are outdated.
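
To make the dconv/mdconv distinction in the notes concrete, here is a minimal pure-Python sketch of a 3x3 deformable convolution evaluated at a single output location on a single-channel image. It is an illustration only, not the CUDA ops used by these models: the function names `bilinear` and `deform_conv_at` are hypothetical, the offsets and masks are passed in by hand rather than predicted by a learned layer, and a single deformable group is assumed.

```python
import math


def bilinear(img, y, x):
    """Sample a 2-D list at fractional (y, x); out-of-bounds pixels read as 0."""
    h, w = len(img), len(img[0])
    y0, x0 = math.floor(y), math.floor(x)
    total = 0.0
    for yy in (y0, y0 + 1):
        for xx in (x0, x0 + 1):
            if 0 <= yy < h and 0 <= xx < w:
                wgt = (1 - abs(y - yy)) * (1 - abs(x - xx))
                total += wgt * img[yy][xx]
    return total


def deform_conv_at(img, weight, offsets, cy, cx, masks=None):
    """3x3 deformable convolution output at (cy, cx).

    offsets: nine (dy, dx) pairs, one per kernel tap (dconv).
    masks:   nine modulation scalars, or None for no modulation (mdconv if given).
    """
    out = 0.0
    k = 0
    for ky in (-1, 0, 1):
        for kx in (-1, 0, 1):
            dy, dx = offsets[k]
            m = 1.0 if masks is None else masks[k]
            sample = bilinear(img, cy + ky + dy, cx + kx + dx)
            out += weight[ky + 1][kx + 1] * m * sample
            k += 1
    return out


img = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
ones = [[1.0] * 3 for _ in range(3)]
zero_offsets = [(0.0, 0.0)] * 9
shifted = [(0.0, 0.5)] * 9  # every tap moved half a pixel to the right

print(deform_conv_at(img, ones, zero_offsets, 1, 1))               # 45.0
print(deform_conv_at(img, ones, shifted, 1, 1))                    # 49.5
print(deform_conv_at(img, ones, zero_offsets, 1, 1, [0.5] * 9))    # 22.5
```

With zero offsets and no mask the result equals an ordinary 3x3 convolution; nonzero offsets move each sampling point, and the mask scales each tap's contribution.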

Results

Object Detection on COCO minival
| MODEL | BOX AP |
|---|---|
| Cascade Mask R-CNN DCN (X-101-32x4d-FPN, 1x, pytorch, dconv(c3-c5)) | 47.3 |
| Cascade Mask R-CNN DCN (R-101-FPN, 1x, pytorch, dconv(c3-c5)) | 45.8 |
| Cascade R-CNN DCN (R-101-FPN, 1x, pytorch, dconv(c3-c5)) | 45.0 |
| Faster R-CNN DCN (X-101-32x4d-FPN, 1x, pytorch, dconv(c3-c5)) | 44.5 |
| Cascade Mask R-CNN DCN (R-50-FPN, 1x, pytorch, dconv(c3-c5)) | 44.4 |
| Cascade R-CNN DCN (R-50-FPN, 1x, pytorch, dconv(c3-c5)) | 43.8 |
| Mask R-CNN DCN (R-101-FPN, 1x, pytorch, dconv(c3-c5)) | 43.5 |
| Faster R-CNN DCN (R-101-FPN, 1x, pytorch, dconv(c3-c5)) | 42.7 |
| Mask R-CNN DCN (R-50-FPN, 1x, pytorch, dconv(c3-c5)) | 41.8 |
| Faster R-CNN DCN (*R-50-FPN (dg=4), 1x, pytorch, mdconv(c3-c5)) | 41.5 |
| Mask R-CNN DCN (R-50-FPN, 1x, pytorch, mdconv(c3-c5)) | 41.5 |
| Faster R-CNN DCN (R-50-FPN, 1x, pytorch, mdconv(c3-c5)) | 41.4 |
| Faster R-CNN DCN (R-50-FPN, 1x, pytorch, dconv(c3-c5)) | 41.3 |
| Faster R-CNN DCN (R-50-FPN, 1x, pytorch, dpool) | 38.9 |
| Faster R-CNN DCN (R-50-FPN, 1x, pytorch, mdpool) | 38.7 |

Instance Segmentation on COCO minival

| MODEL | MASK AP |
|---|---|
| Cascade Mask R-CNN DCN (X-101-32x4d-FPN, 1x, pytorch, dconv(c3-c5)) | 41.1 |
| Cascade Mask R-CNN DCN (R-101-FPN, 1x, pytorch, dconv(c3-c5)) | 39.7 |
| Mask R-CNN DCN (R-101-FPN, 1x, pytorch, dconv(c3-c5)) | 38.9 |
| Cascade Mask R-CNN DCN (R-50-FPN, 1x, pytorch, dconv(c3-c5)) | 38.6 |
| Mask R-CNN DCN (R-50-FPN, 1x, pytorch, dconv(c3-c5)) | 37.4 |
| Mask R-CNN DCN (R-50-FPN, 1x, pytorch, mdconv(c3-c5)) | 37.1 |