FoveaBox

Last updated on Feb 23, 2021

FoveaBox (R-101, 1x, pytorch, MS train=N, align=N)

Memory (M) 9200.0
inference time (s/im) 0.05747
File Size 211.89 MB
Training Data MS COCO
Training Resources 8x NVIDIA V100 GPUs
Training Time

Architecture ResNet, FPN, Focal Loss
MS train N
lr sched 1x
Backbone Layers 101
FoveaBox (R-101, 2x, pytorch, MS train=N, align=N)

Memory (M) 11700.0
Backbone Layers 101
File Size 211.89 MB
Training Data MS COCO
Training Resources 8x NVIDIA V100 GPUs
Training Time

Architecture ResNet, FPN, Focal Loss
MS train N
lr sched 2x
FoveaBox (R-101, 2x, pytorch, MS train=N, align=Y)

Memory (M) 11700.0
inference time (s/im) 0.06803
File Size 220.27 MB
Training Data MS COCO
Training Resources 8x NVIDIA V100 GPUs
Training Time

Architecture ResNet, FPN, Focal Loss
MS train N
lr sched 2x
Backbone Layers 101
FoveaBox (R-101, 2x, pytorch, MS train=Y, align=Y)

Memory (M) 11700.0
inference time (s/im) 0.06803
File Size 220.27 MB
Training Data MS COCO
Training Resources 8x NVIDIA V100 GPUs
Training Time

Architecture ResNet, FPN, Focal Loss
MS train Y
lr sched 2x
Backbone Layers 101
FoveaBox (R-50, 1x, pytorch, MS train=N, align=N)

Memory (M) 5600.0
inference time (s/im) 0.04149
File Size 139.19 MB
Training Data MS COCO
Training Resources 8x NVIDIA V100 GPUs
Training Time

Architecture ResNet, FPN, Focal Loss
MS train N
lr sched 1x
Backbone Layers 50
FoveaBox (R-50, 2x, pytorch, MS train=N, align=N)

Memory (M) 5600.0
Backbone Layers 50
File Size 139.19 MB
Training Data MS COCO
Training Resources 8x NVIDIA V100 GPUs
Training Time

Architecture ResNet, FPN, Focal Loss
MS train N
lr sched 2x
FoveaBox (R-50, 2x, pytorch, MS train=N, align=Y)

Memory (M) 8100.0
inference time (s/im) 0.05155
File Size 147.57 MB
Training Data MS COCO
Training Resources 8x NVIDIA V100 GPUs
Training Time

Architecture ResNet, FPN, Focal Loss
MS train N
lr sched 2x
Backbone Layers 50
FoveaBox (R-50, 2x, pytorch, MS train=Y, align=Y)

Memory (M) 8100.0
inference time (s/im) 0.05464
File Size 147.57 MB
Training Data MS COCO
Training Resources 8x NVIDIA V100 GPUs
Training Time

Architecture ResNet, FPN, Focal Loss
MS train Y
lr sched 2x
Backbone Layers 50
README.md

FoveaBox: Beyond Anchor-based Object Detector


FoveaBox is an accurate, flexible, and completely anchor-free object detection framework, as presented in our paper (https://arxiv.org/abs/1904.03797). Unlike previous anchor-based methods, FoveaBox directly learns the likelihood that an object exists and the bounding box coordinates, without any anchor reference. This is achieved by (a) predicting category-sensitive semantic maps for the object existence likelihood, and (b) producing a category-agnostic bounding box for each position that potentially contains an object.
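To make steps (a) and (b) concrete, here is a minimal pure-Python sketch of how per-position predictions on one feature level could be turned into detections. It is an illustration under stated assumptions, not the paper's exact formulation: the function name `decode_level`, the score threshold, and the choice to encode box sides as log-distances scaled by the stride are all hypothetical simplifications.

```python
import math


def sigmoid(x):
    """Map a raw logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def decode_level(cls_logits, box_preds, stride, score_thr=0.5):
    """Decode one feature level into (label, score, box) tuples.

    cls_logits: nested list [C][H][W] of raw class scores (category-sensitive maps).
    box_preds:  nested list [4][H][W] of log-distances to the left, top, right,
                and bottom box sides (ASSUMED parameterization, for illustration).
    stride:     downsampling factor of this feature level relative to the image.
    """
    num_classes = len(cls_logits)
    height = len(cls_logits[0])
    width = len(cls_logits[0][0])
    detections = []
    for y in range(height):
        for x in range(width):
            # (a) Each class has its own score map; pick the best class here.
            scores = [sigmoid(cls_logits[c][y][x]) for c in range(num_classes)]
            label = max(range(num_classes), key=scores.__getitem__)
            score = scores[label]
            if score < score_thr:
                continue  # this position probably contains no object
            # Center of the feature-map cell in image coordinates.
            cx = (x + 0.5) * stride
            cy = (y + 0.5) * stride
            # (b) One class-agnostic box per position, with no anchor involved.
            left, top, right, bottom = (
                math.exp(box_preds[k][y][x]) * stride for k in range(4)
            )
            detections.append((label, score, (cx - left, cy - top, cx + right, cy + bottom)))
    return detections


# Toy input: 2 classes on a 1x2 feature map with stride 8.
toy_cls = [[[-5.0, 3.0]], [[-5.0, -2.0]]]
toy_box = [[[0.0, 0.0]] for _ in range(4)]
toy_detections = decode_level(toy_cls, toy_box, stride=8)
```

Note that the box branch has four channels regardless of the number of classes, which is what "category-agnostic" means here, while the classification branch has one map per class.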

Main Results

Results on R50/101-FPN

| Backbone | Style | align | ms-train | Lr schd | Mem (GB) | Inf time (fps) | box AP | Config | Download |
|---|---|---|---|---|---|---|---|---|---|
| R-50 | pytorch | N | N | 1x | 5.6 | 24.1 | 36.5 | config | model \| log |
| R-50 | pytorch | N | N | 2x | 5.6 | - | 37.2 | config | model \| log |
| R-50 | pytorch | Y | N | 2x | 8.1 | 19.4 | 37.9 | config | model \| log |
| R-50 | pytorch | Y | Y | 2x | 8.1 | 18.3 | 40.4 | config | model \| log |
| R-101 | pytorch | N | N | 1x | 9.2 | 17.4 | 38.6 | config | model \| log |
| R-101 | pytorch | N | N | 2x | 11.7 | - | 40.0 | config | model \| log |
| R-101 | pytorch | Y | N | 2x | 11.7 | 14.7 | 40.0 | config | model \| log |
| R-101 | pytorch | Y | Y | 2x | 11.7 | 14.7 | 42.0 | config | model \| log |

[1] 1x and 2x mean the model is trained for 12 and 24 epochs, respectively.
[2] Align means utilizing deformable convolution to align the cls branch.
[3] All results are obtained with a single model and without any test-time data augmentation.
[4] We use 4 GPUs for training.
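The per-model listings on this page give memory in M and inference time in seconds per image, while the table above reports GB and frames per second. The short sketch below shows how the two sets of numbers relate. The helper names are hypothetical, and the factor of 1000 between M and GB is inferred from the listed values (for example, 9200.0 M against 9.2 GB) rather than stated anywhere on the page.

```python
# Schedule names mapped to training epochs, as given in note [1].
SCHEDULE_EPOCHS = {"1x": 12, "2x": 24}


def to_fps(seconds_per_image):
    """Convert inference time in s/im to frames per second, rounded to 1 decimal."""
    return round(1.0 / seconds_per_image, 1)


def to_gb(megabytes):
    """Convert memory in M to GB, assuming 1 GB = 1000 M as the listed values suggest."""
    return megabytes / 1000.0


# R-50, 1x, align=N: listed as 0.04149 s/im and 5600.0 M.
r50_fps = to_fps(0.04149)
r50_mem_gb = to_gb(5600.0)
```

Applied to the listed values, these conversions reproduce the Mem (GB) and Inf time (fps) columns of the table.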

Any pull requests or issues are welcome.

Citations

Please consider citing our paper in your publications if the project helps your research. BibTeX reference is as follows.

@article{kong2019foveabox,
  title={FoveaBox: Beyond Anchor-based Object Detector},
  author={Kong, Tao and Sun, Fuchun and Liu, Huaping and Jiang, Yuning and Shi, Jianbo},
  journal={arXiv preprint arXiv:1904.03797},
  year={2019}
}

Results

Object Detection on COCO minival

MODEL BOX AP
FoveaBox (R-101, 2x, pytorch, MS train=Y, align=Y) 42.0
FoveaBox (R-50, 2x, pytorch, MS train=Y, align=Y) 40.4
FoveaBox (R-101, 2x, pytorch, MS train=N, align=N) 40.0
FoveaBox (R-101, 2x, pytorch, MS train=N, align=Y) 40.0
FoveaBox (R-101, 1x, pytorch, MS train=N, align=N) 38.6
FoveaBox (R-50, 2x, pytorch, MS train=N, align=Y) 37.9
FoveaBox (R-50, 2x, pytorch, MS train=N, align=N) 37.2
FoveaBox (R-50, 1x, pytorch, MS train=N, align=N) 36.5