A ResNeXt repeats a building block that aggregates a set of transformations with the same topology. Compared to a ResNet, it exposes a new dimension, cardinality (the size of the set of transformations) $C$, as an essential factor in addition to the dimensions of depth and width.
Formally, a set of aggregated transformations can be represented as: $\mathcal{F}(x)=\sum_{i=1}^{C}\mathcal{T}_i(x)$, where $\mathcal{T}_i(x)$ can be an arbitrary function. Analogous to a simple neuron, $\mathcal{T}_i$ should project $x$ into an (optionally low-dimensional) embedding and then transform it.
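The aggregation formula can be illustrated with a minimal sketch in plain Python. It is not the paper's implementation: practical ResNeXt blocks use convolutions, while this toy uses small dense matrices. The helper names (`matvec`, `relu`, `make_branch`, `aggregate`), the random weights, and the values of `C`, `dim`, and `bottleneck` are all assumptions chosen for illustration. Each branch $\mathcal{T}_i$ projects $x$ into a low-dimensional embedding, applies a nonlinearity, and projects back, and the $C$ branch outputs are summed elementwise.

```python
import random


def matvec(W, v):
    """Multiply a matrix W (a list of rows) by a vector v."""
    return [sum(w * a for w, a in zip(row, v)) for row in W]


def relu(v):
    """Elementwise ReLU."""
    return [max(0.0, a) for a in v]


def make_branch(dim, bottleneck, rng):
    """Build one transformation T_i: project down, apply ReLU, project back up.

    Every branch has the same topology; only the (random, illustrative)
    weights differ between branches.
    """
    W_down = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(bottleneck)]
    W_up = [[rng.uniform(-1, 1) for _ in range(bottleneck)] for _ in range(dim)]

    def T(x):
        return matvec(W_up, relu(matvec(W_down, x)))

    return T


def aggregate(branches, x):
    """Compute F(x) = sum_{i=1}^{C} T_i(x) as an elementwise sum."""
    outputs = [T(x) for T in branches]
    return [sum(vals) for vals in zip(*outputs)]


rng = random.Random(0)
C, dim, bottleneck = 4, 8, 2  # cardinality, input width, embedding width (assumed values)
branches = [make_branch(dim, bottleneck, rng) for _ in range(C)]
x = [rng.uniform(-1, 1) for _ in range(dim)]
Fx = aggregate(branches, x)  # same dimensionality as x
```

In this sketch, increasing `C` adds more parallel branches without making the block deeper or any single branch wider, which is the sense in which cardinality acts as a separate dimension.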
Source: Aggregated Residual Transformations for Deep Neural Networks

Task | Papers | Share |
---|---|---|
Object Detection | 37 | 13.96% |
Image Classification | 28 | 10.57% |
Semantic Segmentation | 23 | 8.68% |
General Classification | 19 | 7.17% |
Instance Segmentation | 15 | 5.66% |
Classification | 8 | 3.02% |
Action Recognition | 7 | 2.64% |
Panoptic Segmentation | 6 | 2.26% |
Pose Estimation | 4 | 1.51% |