A (2+1)D convolution is a type of convolution used in convolutional neural networks for action recognition, where the input is a spatiotemporal volume (a stack of video frames). Rather than applying a full 3D convolution over the entire volume, which can be computationally expensive and prone to overfitting, a (2+1)D convolution factorizes the computation into two successive convolutions: a spatial 2D convolution followed by a temporal 1D convolution.
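The factorization described above can be illustrated with a minimal single-channel sketch in plain Python. This is not code from the paper: the function names (`conv2d_valid`, `conv1d_time_valid`, `conv2plus1d`, `conv3d_valid`), the toy video, and the kernels are all hypothetical choices made for illustration. The sketch also assumes no padding, stride 1, and no nonlinearity between the two steps, so that the result can be compared directly with a 3D convolution whose kernel is the outer product of the temporal and spatial kernels.

```python
def conv2d_valid(frame, k):
    """Valid 2D cross-correlation of one frame (list of rows) with kernel k."""
    kh, kw = len(k), len(k[0])
    h, w = len(frame), len(frame[0])
    return [
        [
            sum(k[b][c] * frame[y + b][x + c] for b in range(kh) for c in range(kw))
            for x in range(w - kw + 1)
        ]
        for y in range(h - kh + 1)
    ]


def conv1d_time_valid(frames, kt):
    """Valid 1D cross-correlation along the time axis, applied at every pixel."""
    t_len, h, w = len(frames), len(frames[0]), len(frames[0][0])
    return [
        [
            [sum(kt[a] * frames[t + a][y][x] for a in range(len(kt))) for x in range(w)]
            for y in range(h)
        ]
        for t in range(t_len - len(kt) + 1)
    ]


def conv2plus1d(video, k_spatial, k_temporal):
    """(2+1)D step: spatial 2D convolution per frame, then temporal 1D convolution."""
    spatial = [conv2d_valid(frame, k_spatial) for frame in video]
    return conv1d_time_valid(spatial, k_temporal)


def conv3d_valid(video, k3):
    """Valid 3D cross-correlation, used here only as a reference for comparison."""
    kt, kh, kw = len(k3), len(k3[0]), len(k3[0][0])
    t_len, h, w = len(video), len(video[0]), len(video[0][0])
    return [
        [
            [
                sum(
                    k3[a][b][c] * video[t + a][y + b][x + c]
                    for a in range(kt)
                    for b in range(kh)
                    for c in range(kw)
                )
                for x in range(w - kw + 1)
            ]
            for y in range(h - kh + 1)
        ]
        for t in range(t_len - kt + 1)
    ]


# Toy integer video with 4 frames of 5x5 pixels (values are arbitrary).
video = [
    [[(t * 7 + y * 3 + x * 5) % 11 for x in range(5)] for y in range(5)]
    for t in range(4)
]
k_spatial = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]  # 3x3 spatial kernel (assumed)
k_temporal = [1, -2, 1]                            # length-3 temporal kernel (assumed)

out = conv2plus1d(video, k_spatial, k_temporal)

# Equivalent separable 3D kernel: outer product of temporal and spatial kernels.
k3 = [[[a * s for s in row] for row in k_spatial] for a in k_temporal]
ref = conv3d_valid(video, k3)
```

In this single-channel setting the factorized pair uses 9 + 3 = 12 weights, compared with 27 for a full 3×3×3 kernel, and it can only represent 3D kernels that separate into a spatial and a temporal part. In a network, the two steps would typically operate over multiple channels, with a nonlinearity between them.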
Source: A Closer Look at Spatiotemporal Convolutions for Action Recognition
Task | Papers | Share |
---|---|---|
Action Recognition | 8 | 17.39% |
Retrieval | 4 | 8.70% |
Video Retrieval | 4 | 8.70% |
Temporal Action Localization | 3 | 6.52% |
Optical Flow Estimation | 2 | 4.35% |
Self-Supervised Action Recognition | 2 | 4.35% |
Self-Supervised Learning | 2 | 4.35% |
Video Recognition | 2 | 4.35% |
Action Classification | 2 | 4.35% |