DLA

Last updated on Feb 14, 2021

dla102

Parameters 33 Million
FLOPs 7 Billion
File Size 129.02 MB
Training Data ImageNet
Training Resources 8x GPUs
Training Techniques SGD with Momentum, Weight Decay
Architecture 1x1 Convolution, Batch Normalization, Convolution, DLA Residual Block, DLA Bottleneck Residual Block, Global Average Pooling, Residual Block, Residual Connection, ReLU, Max Pooling, Softmax
ID dla102
LR 0.1
Epochs 120
Layers 102
Crop Pct 0.875
Momentum 0.9
Batch Size 256
Image Size 224
Weight Decay 0.0001
Interpolation bilinear
dla102x

Parameters 26 Million
FLOPs 6 Billion
File Size 102.57 MB
Training Data ImageNet
Training Resources 8x GPUs
Training Techniques SGD with Momentum, Weight Decay
Architecture 1x1 Convolution, Batch Normalization, Convolution, DLA Residual Block, DLA Bottleneck Residual Block, Global Average Pooling, Residual Block, Residual Connection, ReLU, Max Pooling, Softmax
ID dla102x
LR 0.1
Epochs 120
Layers 102
Crop Pct 0.875
Momentum 0.9
Batch Size 256
Image Size 224
Weight Decay 0.0001
Interpolation bilinear
dla102x2

Parameters 41 Million
FLOPs 9 Billion
File Size 159.88 MB
Training Data ImageNet
Training Resources 8x GPUs
Training Techniques SGD with Momentum, Weight Decay
Architecture 1x1 Convolution, Batch Normalization, Convolution, DLA Residual Block, DLA Bottleneck Residual Block, Global Average Pooling, Residual Block, Residual Connection, ReLU, Max Pooling, Softmax
ID dla102x2
LR 0.1
Epochs 120
Layers 102
Crop Pct 0.875
Momentum 0.9
Batch Size 256
Image Size 224
Weight Decay 0.0001
Interpolation bilinear
dla169

Parameters 53 Million
FLOPs 12 Billion
File Size 206.52 MB
Training Data ImageNet
Training Resources 8x GPUs
Training Techniques SGD with Momentum, Weight Decay
Architecture 1x1 Convolution, Batch Normalization, Convolution, DLA Residual Block, DLA Bottleneck Residual Block, Global Average Pooling, Residual Block, Residual Connection, ReLU, Max Pooling, Softmax
ID dla169
LR 0.1
Epochs 120
Layers 169
Crop Pct 0.875
Momentum 0.9
Batch Size 256
Image Size 224
Weight Decay 0.0001
Interpolation bilinear
dla34

Parameters 16 Million
FLOPs 3 Billion
File Size 60.30 MB
Training Data ImageNet
Training Techniques SGD with Momentum, Weight Decay
Architecture 1x1 Convolution, Batch Normalization, Convolution, DLA Residual Block, DLA Bottleneck Residual Block, Global Average Pooling, Residual Block, Residual Connection, ReLU, Max Pooling, Softmax
ID dla34
LR 0.1
Epochs 120
Layers 34
Crop Pct 0.875
Momentum 0.9
Batch Size 256
Image Size 224
Weight Decay 0.0001
Interpolation bilinear
dla46_c

Parameters 1 Million
FLOPs 583 Million
File Size 5.06 MB
Training Data ImageNet
Training Techniques SGD with Momentum, Weight Decay
Architecture 1x1 Convolution, Batch Normalization, Convolution, DLA Residual Block, DLA Bottleneck Residual Block, Global Average Pooling, Residual Block, Residual Connection, ReLU, Max Pooling, Softmax
ID dla46_c
LR 0.1
Epochs 120
Layers 46
Crop Pct 0.875
Momentum 0.9
Batch Size 256
Image Size 224
Weight Decay 0.0001
Interpolation bilinear
dla46x_c

Parameters 1 Million
FLOPs 544 Million
File Size 4.18 MB
Training Data ImageNet
Training Techniques SGD with Momentum, Weight Decay
Architecture 1x1 Convolution, Batch Normalization, Convolution, DLA Residual Block, DLA Bottleneck Residual Block, Global Average Pooling, Residual Block, Residual Connection, ReLU, Max Pooling, Softmax
ID dla46x_c
LR 0.1
Epochs 120
Layers 46
Crop Pct 0.875
Momentum 0.9
Batch Size 256
Image Size 224
Weight Decay 0.0001
Interpolation bilinear
dla60

Parameters 22 Million
FLOPs 4 Billion
File Size 85.41 MB
Training Data ImageNet
Training Techniques SGD with Momentum, Weight Decay
Architecture 1x1 Convolution, Batch Normalization, Convolution, DLA Residual Block, DLA Bottleneck Residual Block, Global Average Pooling, Residual Block, Residual Connection, ReLU, Max Pooling, Softmax
ID dla60
LR 0.1
Epochs 120
Layers 60
Dropout 0.2
Crop Pct 0.875
Momentum 0.9
Batch Size 256
Image Size 224
Weight Decay 0.0001
Interpolation bilinear
dla60_res2net

Parameters 21 Million
FLOPs 4 Billion
File Size 80.95 MB
Training Data ImageNet
Training Techniques SGD with Momentum, Weight Decay
Architecture 1x1 Convolution, Batch Normalization, Convolution, DLA Residual Block, DLA Bottleneck Residual Block, Global Average Pooling, Residual Block, Residual Connection, ReLU, Max Pooling, Softmax
ID dla60_res2net
Layers 60
Crop Pct 0.875
Image Size 224
Interpolation bilinear
dla60_res2next

Parameters 17 Million
FLOPs 3 Billion
File Size 66.41 MB
Training Data ImageNet
Training Techniques SGD with Momentum, Weight Decay
Architecture 1x1 Convolution, Batch Normalization, Convolution, DLA Residual Block, DLA Bottleneck Residual Block, Global Average Pooling, Residual Block, Residual Connection, ReLU, Max Pooling, Softmax
ID dla60_res2next
Layers 60
Crop Pct 0.875
Image Size 224
Interpolation bilinear
dla60x

Parameters 17 Million
FLOPs 4 Billion
File Size 67.60 MB
Training Data ImageNet
Training Techniques SGD with Momentum, Weight Decay
Architecture 1x1 Convolution, Batch Normalization, Convolution, DLA Residual Block, DLA Bottleneck Residual Block, Global Average Pooling, Residual Block, Residual Connection, ReLU, Max Pooling, Softmax
ID dla60x
LR 0.1
Epochs 120
Layers 60
Crop Pct 0.875
Momentum 0.9
Batch Size 256
Image Size 224
Weight Decay 0.0001
Interpolation bilinear
dla60x_c

Parameters 1 Million
FLOPs 593 Million
File Size 5.20 MB
Training Data ImageNet
Training Techniques SGD with Momentum, Weight Decay
Architecture 1x1 Convolution, Batch Normalization, Convolution, DLA Residual Block, DLA Bottleneck Residual Block, Global Average Pooling, Residual Block, Residual Connection, ReLU, Max Pooling, Softmax
ID dla60x_c
LR 0.1
Epochs 120
Layers 60
Crop Pct 0.875
Momentum 0.9
Batch Size 256
Image Size 224
Weight Decay 0.0001
Interpolation bilinear

Summary

Extending “shallow” skip connections, Deep Layer Aggregation (DLA) incorporates more depth and sharing. The authors introduce two aggregation structures: iterative deep aggregation (IDA) and hierarchical deep aggregation (HDA). These structures are expressed through an architectural framework that is independent of the choice of backbone, so they remain compatible with current and future networks.

IDA focuses on fusing resolutions and scales, while HDA focuses on merging features from all modules and channels. IDA follows the base hierarchy to refine resolution and aggregate scale stage by stage. HDA assembles its own hierarchy of tree-structured connections that cross and merge stages to aggregate different levels of representation.
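
As a rough, hypothetical sketch of the iterative idea (not the reference implementation), an aggregation node fuses two feature maps with a 1x1 convolution, batch normalization and ReLU, and IDA applies such nodes repeatedly from the shallowest stage to the deepest. The module name, channel counts and resolution below are illustrative only, and assume the inputs have already been resampled to a common spatial size:

import torch
import torch.nn as nn

class AggregationNode(nn.Module):
    # Hypothetical two-input aggregation node: concatenate, project with a 1x1
    # convolution, batch-normalize and apply ReLU.
    def __init__(self, in_channels, out_channels):
        super().__init__()
        self.proj = nn.Conv2d(in_channels, out_channels, kernel_size=1, bias=False)
        self.bn = nn.BatchNorm2d(out_channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, a, b):
        return self.relu(self.bn(self.proj(torch.cat([a, b], dim=1))))

def iterative_deep_aggregation(stage_outputs, nodes):
    # Fuse stage outputs from shallowest to deepest, one aggregation node per step.
    aggregated = stage_outputs[0]
    for feature, node in zip(stage_outputs[1:], nodes):
        aggregated = node(aggregated, feature)
    return aggregated

# Toy usage: three "stages" with 16, 32 and 64 channels, all at 56x56 resolution.
stages = [torch.randn(1, c, 56, 56) for c in (16, 32, 64)]
nodes = [AggregationNode(16 + 32, 32), AggregationNode(32 + 64, 64)]
out = iterative_deep_aggregation(stages, nodes)
print(out.shape)  # torch.Size([1, 64, 56, 56])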

How do I load this model?

To load a pretrained model:

import timm
m = timm.create_model('dla34', pretrained=True)
m.eval()

Replace the model name with the variant you want to use, e.g. dla34. You can find the IDs in the model summaries at the top of this page.
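
To run inference with a loaded model, timm can build the matching preprocessing (224x224 input, 0.875 crop percentage, bilinear interpolation) from the model's pretrained configuration. The snippet below is a minimal sketch; the image path is a placeholder:

import timm
import torch
from PIL import Image
from timm.data import resolve_data_config, create_transform

m = timm.create_model('dla34', pretrained=True)
m.eval()

# Build the evaluation transform (resize, center crop, normalize) from the model's config.
config = resolve_data_config({}, model=m)
transform = create_transform(**config)

img = Image.open('your_image.jpg').convert('RGB')   # placeholder image path
x = transform(img).unsqueeze(0)                     # shape: (1, 3, 224, 224)

with torch.no_grad():
    probs = m(x).softmax(dim=-1)
top5 = probs.topk(5)
print(top5.indices, top5.values)

To see every DLA variant that timm provides, timm.list_models('dla*') returns the matching model names.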

How do I train this model?

You can follow the timm recipe scripts to train a new model from scratch.
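
As a rough, minimal sketch of the recipe listed in the summaries above (SGD with momentum 0.9, weight decay 1e-4, learning rate 0.1, batch size 256, 120 epochs), a plain PyTorch loop might look like this. The ImageNet path is a placeholder, and the timm recipe scripts additionally handle learning-rate scheduling, augmentation, logging and distributed training:

import timm
import torch
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Placeholder ImageNet training folder with standard augmentation.
train_dataset = datasets.ImageFolder(
    '/path/to/imagenet/train',
    transforms.Compose([
        transforms.RandomResizedCrop(224),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225]),
    ]))
train_loader = DataLoader(train_dataset, batch_size=256, shuffle=True, num_workers=8)

model = timm.create_model('dla34', pretrained=False).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=1e-4)
criterion = torch.nn.CrossEntropyLoss()

for epoch in range(120):
    model.train()
    for images, targets in train_loader:
        images, targets = images.to(device), targets.to(device)
        loss = criterion(model(images), targets)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()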

Citation

@misc{yu2019deep,
      title={Deep Layer Aggregation}, 
      author={Fisher Yu and Dequan Wang and Evan Shelhamer and Trevor Darrell},
      year={2019},
      eprint={1707.06484},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Results

Image Classification on ImageNet

Model            Top-1 Accuracy   Top-5 Accuracy
dla102x2         79.44%           94.65%
dla169           78.69%           94.33%
dla102x          78.51%           94.23%
dla60_res2net    78.46%           94.21%
dla60_res2next   78.44%           94.16%
dla60x           78.25%           94.02%
dla102           78.03%           93.95%
dla60            77.04%           93.32%
dla34            74.62%           92.06%
dla60x_c         67.91%           88.42%
dla46x_c         65.98%           86.99%
dla46_c          64.87%           86.29%
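
To reproduce numbers like these, a minimal top-1/top-5 evaluation loop over an ImageNet validation folder could look like the sketch below; the dataset path and batch size are placeholders, and timm's repository also includes a validation script that does this more conveniently.

import timm
import torch
from torch.utils.data import DataLoader
from torchvision import datasets
from timm.data import resolve_data_config, create_transform

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

model = timm.create_model('dla34', pretrained=True).to(device).eval()
config = resolve_data_config({}, model=model)                # crop pct, interpolation, mean/std
val_dataset = datasets.ImageFolder('/path/to/imagenet/val', create_transform(**config))
val_loader = DataLoader(val_dataset, batch_size=64, num_workers=8)

top1 = top5 = total = 0
with torch.no_grad():
    for images, targets in val_loader:
        images, targets = images.to(device), targets.to(device)
        pred5 = model(images).topk(5, dim=1).indices         # top-5 predicted classes
        correct = pred5.eq(targets.unsqueeze(1))             # (batch, 5) boolean matches
        top1 += correct[:, 0].sum().item()
        top5 += correct.any(dim=1).sum().item()
        total += targets.size(0)

print(f'top-1: {100 * top1 / total:.2f}%  top-5: {100 * top5 / total:.2f}%')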