ESE VovNet

Last updated on Feb 14, 2021

ese_vovnet19b_dw

Parameters 7 Million
FLOPs 2 Billion
File Size 25.03 MB
Training Data ImageNet

Architecture Convolution, Max Pooling, One-Shot Aggregation, Batch Normalization, ReLU
ID ese_vovnet19b_dw
Layers 19
Crop Pct 0.875
Image Size 224
Interpolation bicubic
ese_vovnet39b

Parameters 25 Million
FLOPs 9 Billion
File Size 93.84 MB
Training Data ImageNet

Architecture Convolution, Max Pooling, One-Shot Aggregation, Batch Normalization, ReLU
ID ese_vovnet39b
Layers 39
Crop Pct 0.875
Image Size 224
Interpolation bicubic
README.md

Summary

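VoVNet is a convolutional neural network that aims to make DenseNet more efficient. Instead of concatenating earlier features into every layer, it concatenates all features only once, in the last feature map of each block (one-shot aggregation). This keeps the input size of the intermediate layers constant and makes it possible to enlarge the number of new output channels.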

Read about one-shot aggregation here.
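To make the "constant input size" point concrete, the sketch below compares the input width of each convolution under dense aggregation and under one-shot aggregation. The function names and channel sizes are hypothetical and chosen for illustration only, and the sketch assumes that the final concatenation includes the block input alongside every layer's output.

```python
def dense_input_channels(in_chs, growth, num_layers):
    """Input width of each conv when every layer sees all earlier features."""
    return [in_chs + i * growth for i in range(num_layers)]


def osa_input_channels(in_chs, mid_chs, num_layers):
    """Input width of each conv when features are concatenated only once."""
    # The first conv reads the block input; every later conv reads only the
    # previous conv's output, so its input width stays at mid_chs.
    return [in_chs] + [mid_chs] * (num_layers - 1)


def osa_concat_channels(in_chs, mid_chs, num_layers):
    """Width of the single concatenation at the end of the block.

    Assumes the block input is concatenated together with all layer outputs.
    """
    return in_chs + num_layers * mid_chs


# Hypothetical channel sizes, not taken from any ese_vovnet configuration.
dense = dense_input_channels(in_chs=64, growth=32, num_layers=5)
osa = osa_input_channels(in_chs=64, mid_chs=32, num_layers=5)
concat = osa_concat_channels(in_chs=64, mid_chs=32, num_layers=5)

print("dense inputs:", dense)
print("one-shot inputs:", osa)
print("one-shot concat width:", concat)
```

With dense aggregation the input width grows with every layer, while with one-shot aggregation only the single final concatenation is wide.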

How do I load this model?

To load a pretrained model:

import timm
# Build the model and load its ImageNet-pretrained weights
m = timm.create_model('ese_vovnet39b', pretrained=True)
# Switch to inference mode
m.eval()

Replace the model name with the variant you want to use, e.g. ese_vovnet19b_dw. You can find the IDs in the model summaries at the top of this page.
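The model summaries list Crop Pct 0.875, Image Size 224 and bicubic interpolation for both variants. As a rough sketch, assuming the common evaluation convention in which the image is first resized so that the center crop covers crop_pct of its shorter side, the resize target can be worked out as follows. The helper name is hypothetical and is not part of timm.

```python
import math


def resize_before_crop(image_size, crop_pct):
    """Shorter-side resize target that precedes the center crop.

    Assumption: the final crop covers crop_pct of the resized shorter side.
    """
    return int(math.floor(image_size / crop_pct))


# Values taken from the model summaries on this page.
resize_to = resize_before_crop(image_size=224, crop_pct=0.875)
print(resize_to)
```

Under that assumption, an input image would be resized (with bicubic interpolation) so that its shorter side is 256 pixels and then center-cropped to 224 x 224 before being passed to the model.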

How do I train this model?

To train a new model from scratch, follow the timm recipe scripts.

Citation

@misc{lee2019energy,
      title={An Energy and GPU-Computation Efficient Backbone Network for Real-Time Object Detection}, 
      author={Youngwan Lee and Joong-won Hwang and Sangrok Lee and Yuseok Bae and Jongyoul Park},
      year={2019},
      eprint={1904.09730},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Results

Image Classification on ImageNet

BENCHMARK  MODEL             METRIC NAME     METRIC VALUE  GLOBAL RANK
ImageNet   ese_vovnet39b     Top 1 Accuracy  79.31%        # 123
ImageNet   ese_vovnet39b     Top 5 Accuracy  94.72%        # 123
ImageNet   ese_vovnet19b_dw  Top 1 Accuracy  76.82%        # 202
ImageNet   ese_vovnet19b_dw  Top 5 Accuracy  93.28%        # 202