An Efficient Spatial Pyramid (ESP) is an image model block based on a factorization principle that decomposes a standard convolution into two steps: (1) point-wise convolutions and (2) a spatial pyramid of dilated convolutions. The point-wise convolutions reduce the computation, while the spatial pyramid of dilated convolutions re-samples the feature maps to learn representations from a large effective receptive field. This makes the block more efficient than other image model blocks such as ResNeXt blocks and Inception modules.
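
The saving from this factorization can be illustrated with a rough weight count. The sketch below is not taken from the text above and rests on several assumptions: the point-wise step projects `m_in` input channels down to `d = n_out / k` channels, the pyramid consists of `k` parallel `n x n` dilated convolutions that each map `d` to `d` channels, branch `i` uses dilation rate `2**(i - 1)`, and biases are ignored. All function and parameter names are hypothetical.

```python
def standard_conv_params(n, m_in, n_out):
    """Weights in a standard n x n convolution (biases ignored)."""
    return n * n * m_in * n_out


def esp_params(n, m_in, n_out, k):
    """Weights in an ESP-style factorization (assumed layout, biases ignored).

    Step 1: a 1 x 1 (point-wise) convolution reduces m_in channels to d = n_out / k.
    Step 2: k parallel n x n dilated convolutions, each mapping d -> d channels.
    """
    if n_out % k != 0:
        raise ValueError("n_out must be divisible by k")
    d = n_out // k
    pointwise = m_in * d          # 1 x 1 kernels
    pyramid = k * n * n * d * d   # k branches of n x n kernels
    return pointwise + pyramid


def effective_receptive_field(n, k):
    """Side length covered by the most dilated branch, assuming rate 2**(k - 1)."""
    largest_dilation = 2 ** (k - 1)
    return (n - 1) * largest_dilation + 1


if __name__ == "__main__":
    n, m_in, n_out, k = 3, 128, 128, 4
    print("standard:", standard_conv_params(n, m_in, n_out))
    print("ESP     :", esp_params(n, m_in, n_out, k))
    print("field   :", effective_receptive_field(n, k))
```

Under these assumptions, a 3 x 3 layer with 128 input and 128 output channels and a four-branch pyramid needs roughly 3.6 times fewer weights than the standard convolution, while the widest branch spans a 17 x 17 window.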
Source: ESPNet: Efficient Spatial Pyramid of Dilated Convolutions for Semantic Segmentation

Task | Papers | Share |
---|---|---|
Speech Recognition | 10 | 19.23% |
Automatic Speech Recognition (ASR) | 8 | 15.38% |
Semantic Segmentation | 5 | 9.62% |
Language Modelling | 3 | 5.77% |
Reinforcement Learning (RL) | 3 | 5.77% |
Speech Separation | 2 | 3.85% |
Real-Time Semantic Segmentation | 2 | 3.85% |
Text Generation | 1 | 1.92% |
Visual Commonsense Reasoning | 1 | 1.92% |

Component type |
---|
Convolutions |
Degridding |
Convolutions |