RetinaNet

Last updated on Feb 12, 2021

RetinaNet ResNet-50 FPN

Parameters 34 Million
FLOPs 527 Billion
File Size 130.27 MB
Training Data MS COCO
Training Resources 8x NVIDIA V100 GPUs
Training Time

Training Techniques Weight Decay, SGD with Momentum, Focal Loss
Architecture FPN, 1x1 Convolution, Bottleneck Residual Block, Batch Normalization, Convolution, Global Average Pooling, Residual Block, Residual Connection, ReLU, Max Pooling, Softmax, Non Maximum Suppression
ID retinanet_resnet50_fpn
LR 0.01
Epochs 26
LR Steps 16, 22
Momentum 0.9
Batch Size 2
Memory (GB) 4.1
LR Step Size 8
Weight Decay 0.0001
train time (s/im) 0.2514
inference time (s/im) 0.0939
Aspect Ratio Group Factor 3

Summary

RetinaNet is a one-stage object detection model that uses a focal loss function to address the class imbalance between foreground and background during training. Focal loss applies a modulating term to the cross-entropy loss, down-weighting easy examples so that learning focuses on hard ones. RetinaNet is a single, unified network composed of a backbone network and two task-specific subnetworks. The backbone, an off-the-shelf convolutional network, computes a convolutional feature map over the entire input image. The first subnet performs convolutional object classification on the backbone's output; the second performs convolutional bounding box regression. Both subnetworks follow a simple design that the authors propose specifically for one-stage, dense detection.
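To make the modulating term concrete, here is a minimal sketch of the binary focal loss for a single prediction, FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t). It is written in plain Python for illustration and is not the torchvision implementation. The function names are hypothetical, and the defaults gamma = 2 and alpha = 0.25 are assumed from the values the paper reports as working best.

```python
import math


def cross_entropy(p, y):
    """Binary cross-entropy for one prediction.

    p: predicted foreground probability, strictly between 0 and 1
    y: ground-truth label, 1 (foreground) or 0 (background)
    """
    p_t = p if y == 1 else 1.0 - p
    return -math.log(p_t)


def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss for one prediction (illustrative sketch).

    gamma and alpha defaults are assumptions taken from the paper's
    reported best setting, not values stated on this page.
    """
    p_t = p if y == 1 else 1.0 - p              # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
    modulator = (1.0 - p_t) ** gamma            # near 0 for easy examples
    return -alpha_t * modulator * math.log(p_t)


# An easy (well-classified) positive versus a hard (misclassified) one.
easy = focal_loss(0.9, 1)
hard = focal_loss(0.1, 1)
print(f"easy example: CE={cross_entropy(0.9, 1):.4f}  FL={easy:.6f}")
print(f"hard example: CE={cross_entropy(0.1, 1):.4f}  FL={hard:.6f}")
```

With gamma = 0 the modulating factor equals 1 and the loss reduces to alpha-weighted cross-entropy. Larger gamma shrinks the contribution of well-classified examples more aggressively.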

How do I load this model?

To load a pretrained model:

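import torchvision.models as models
# Downloads COCO-pretrained weights on first use; newer torchvision releases prefer the weights= argument over pretrained=True.
retinanet_resnet50_fpn = models.detection.retinanet_resnet50_fpn(pretrained=True)
retinanet_resnet50_fpn.eval()  # switch to inference mode before running predictions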

Replace the model name with the variant you want to use, e.g. retinanet_resnet50_fpn. You can find the IDs in the model summaries at the top of this page.

To evaluate the model, use the object detection recipes from the library.

How do I train this model?

You can follow the torchvision recipe on GitHub to train a new model from scratch.
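The hyperparameters listed at the top of this page (LR 0.01, LR steps at epochs 16 and 22, 26 epochs) describe a multi-step learning-rate schedule. The sketch below shows what such a schedule could look like in plain Python. The function name is hypothetical, and the decay factor of 0.1 is an assumption, since this page does not state it.

```python
def multistep_lr(epoch, base_lr=0.01, milestones=(16, 22), gamma=0.1):
    """Learning rate for a zero-indexed epoch under a multi-step schedule.

    base_lr and milestones come from the model summary on this page;
    gamma (the decay factor applied at each milestone) is an assumption.
    """
    # Count how many milestones have been reached by this epoch.
    drops = sum(1 for m in milestones if epoch >= m)
    return base_lr * gamma ** drops


# Learning rate for each of the 26 training epochs.
schedule = [multistep_lr(epoch) for epoch in range(26)]
for epoch in (0, 15, 16, 21, 22, 25):
    print(f"epoch {epoch:2d}: lr = {schedule[epoch]:.6f}")
```

Under these assumptions the rate stays at 0.01 through epoch 15, drops by a factor of ten at epoch 16, and drops again at epoch 22.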

Citation

@article{DBLP:journals/corr/abs-1708-02002,
  author    = {Tsung{-}Yi Lin and
               Priya Goyal and
               Ross B. Girshick and
               Kaiming He and
               Piotr Doll{\'{a}}r},
  title     = {Focal Loss for Dense Object Detection},
  journal   = {CoRR},
  volume    = {abs/1708.02002},
  year      = {2017},
  url       = {http://arxiv.org/abs/1708.02002},
  archivePrefix = {arXiv},
  eprint    = {1708.02002},
  timestamp = {Mon, 13 Aug 2018 16:46:12 +0200},
  biburl    = {https://dblp.org/rec/journals/corr/abs-1708-02002.bib},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}

Results

Object Detection on COCO minival

Object Detection
Benchmark | Model | Metric Name | Metric Value | Global Rank
COCO minival | RetinaNet ResNet-50 FPN | box AP | 36.4 | #108