All DLA variants on this page share the same training techniques and architecture components:

| Training Techniques | SGD with Momentum, Weight Decay |
|---|---|
| Architecture | 1x1 Convolution, Batch Normalization, Convolution, DLA Residual Block, DLA Bottleneck Residual Block, Global Average Pooling, Residual Block, Residual Connection, ReLU, Max Pooling, Softmax |
| IDs | dla102, dla102x, dla102x2, dla169, dla34, dla46_c, dla46x_c, dla60, dla60_res2net, dla60_res2next, dla60x, dla60x_c |
Extending “shallow” skip connections, Deep Layer Aggregation (DLA) incorporates more depth and sharing. The authors introduce two structures: iterative deep aggregation (IDA) and hierarchical deep aggregation (HDA). Both are expressed through an architectural framework that is independent of the choice of backbone, for compatibility with current and future networks.

IDA focuses on fusing resolutions and scales, while HDA focuses on merging features from all modules and channels. IDA follows the base hierarchy to refine resolution and aggregate scale stage by stage; HDA assembles its own hierarchy of tree-structured connections that cross and merge stages to aggregate different levels of representation.
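To make the aggregation idea concrete, here is a minimal PyTorch sketch of IDA-style fusion. It is an illustration under simplifying assumptions, not the reference DLA implementation: the 1x1-conv fusion node, the bilinear upsampling, and the names `AggregationNode` and `iterative_deep_aggregation` are all hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AggregationNode(nn.Module):
    """Fuse two feature maps with a 1x1 conv + BN + ReLU (simplified)."""
    def __init__(self, in_channels, out_channels):
        super().__init__()
        self.project = nn.Sequential(
            nn.Conv2d(in_channels, out_channels, kernel_size=1, bias=False),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, x, y):
        # Deeper stages have lower resolution; upsample before fusing.
        if y.shape[-2:] != x.shape[-2:]:
            y = F.interpolate(y, size=x.shape[-2:], mode='bilinear',
                              align_corners=False)
        return self.project(torch.cat([x, y], dim=1))

def iterative_deep_aggregation(stages, nodes):
    # IDA: start from the shallowest stage and iteratively merge in
    # deeper stages, refining the aggregate stage by stage.
    out = stages[0]
    for stage, node in zip(stages[1:], nodes):
        out = node(out, stage)
    return out

# Toy usage: aggregate three stages of increasing depth / decreasing size.
s1 = torch.randn(1, 16, 32, 32)
s2 = torch.randn(1, 32, 16, 16)
s3 = torch.randn(1, 64, 8, 8)
nodes = [AggregationNode(16 + 32, 32), AggregationNode(32 + 64, 64)]
out = iterative_deep_aggregation([s1, s2, s3], nodes)
print(out.shape)  # torch.Size([1, 64, 32, 32])
```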
To load a pretrained model:
```python
import timm

# Create a pretrained DLA-34 model and switch it to inference mode
m = timm.create_model('dla34', pretrained=True)
m.eval()
```
Replace the model name with the variant you want to use, e.g. `dla34`. You can find the IDs in the model summary table at the top of this page.
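For end-to-end inference, a sketch along the following lines should work. The image path is a placeholder; `resolve_data_config` and `create_transform` are timm helpers that rebuild the preprocessing pipeline used for the pretrained weights.

```python
import torch
import timm
from PIL import Image
from timm.data import resolve_data_config, create_transform

m = timm.create_model('dla34', pretrained=True)
m.eval()

# Build the preprocessing that matches the model's pretraining.
config = resolve_data_config({}, model=m)
transform = create_transform(**config)

img = Image.open('example.jpg').convert('RGB')  # placeholder image path
x = transform(img).unsqueeze(0)  # add a batch dimension

with torch.no_grad():
    probs = m(x).softmax(dim=-1)

top5 = torch.topk(probs, k=5)
print(top5.indices)  # ImageNet class indices
print(top5.values)   # corresponding probabilities
```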
You can follow the timm recipe scripts to train a new model from scratch.
```bibtex
@misc{yu2019deep,
  title={Deep Layer Aggregation},
  author={Fisher Yu and Dequan Wang and Evan Shelhamer and Trevor Darrell},
  year={2019},
  eprint={1707.06484},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}
```
| Model | Top-1 Accuracy | Top-5 Accuracy |
|---|---|---|
| dla102x2 | 79.44% | 94.65% |
| dla169 | 78.69% | 94.33% |
| dla102x | 78.51% | 94.23% |
| dla60_res2net | 78.46% | 94.21% |
| dla60_res2next | 78.44% | 94.16% |
| dla60x | 78.25% | 94.02% |
| dla102 | 78.03% | 93.95% |
| dla60 | 77.04% | 93.32% |
| dla34 | 74.62% | 92.06% |
| dla60x_c | 67.91% | 88.42% |
| dla46x_c | 65.98% | 86.99% |
| dla46_c | 64.87% | 86.29% |