An Efficient Spatial Pyramid (ESP) is an image model block based on a factorization principle that decomposes a standard convolution into two steps: (1) point-wise convolutions and (2) a spatial pyramid of dilated convolutions. The point-wise convolutions reduce the computation, while the spatial pyramid of dilated convolutions re-samples the feature maps to learn representations from a large effective receptive field. This makes the block more efficient than other image model blocks such as ResNeXt blocks and Inception modules.
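The saving from this factorization can be illustrated with a small weight count. The sketch below is not taken from the text above: it assumes, following the usual description of the ESPNet design, that the point-wise step projects M input channels down to d = N / K channels, and that each of the K parallel branches applies an n x n convolution with dilation rate 2^(k-1). The function names are hypothetical and biases are ignored.

```python
def standard_conv_params(m, n_out, kernel=3):
    """Weights in a standard kernel x kernel convolution (biases ignored)."""
    return kernel * kernel * m * n_out


def esp_params(m, n_out, k_branches, kernel=3):
    """Weights in an ESP-style block under the assumptions stated above."""
    if n_out % k_branches != 0:
        raise ValueError("n_out must be divisible by k_branches")
    d = n_out // k_branches  # assumed reduced width: d = N / K
    pointwise = m * d  # 1 x 1 projection from M to d channels
    pyramid = k_branches * kernel * kernel * d * d  # K dilated branches, d -> d
    return pointwise + pyramid


def effective_kernel_sizes(k_branches, kernel=3):
    """Side length covered by each branch, assuming dilation rates 1, 2, 4, ..."""
    return [(kernel - 1) * 2 ** k + 1 for k in range(k_branches)]


if __name__ == "__main__":
    m, n_out, k = 128, 128, 4
    print("standard:", standard_conv_params(m, n_out))
    print("esp     :", esp_params(m, n_out, k))
    print("coverage:", effective_kernel_sizes(k))
```

With M = N = 128, K = 4 and 3 x 3 kernels, the standard convolution has 147456 weights and the factorized block has 40960, while the widest branch covers a 17 x 17 region. These figures hold only under the assumed layout.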
Source: ESPNet: Efficient Spatial Pyramid of Dilated Convolutions for Semantic Segmentation
Task | Papers | Share |
---|---|---|
Speech Recognition | 10 | 21.74% |
Automatic Speech Recognition | 8 | 17.39% |
Semantic Segmentation | 5 | 10.87% |
Reinforcement Learning | 3 | 6.52% |
Speech Separation | 2 | 4.35% |
Language Modelling | 2 | 4.35% |
Real-Time Semantic Segmentation | 2 | 4.35% |
Robust Speech Recognition | 1 | 2.17% |
Speech Enhancement | 1 | 2.17% |
Component | Type |
---|---|
 | Convolutions |
 | Degridding |
 | Convolutions |