nnFormer, or not-another transFormer, is a semantic segmentation model with an interleaved architecture based on an empirical combination of self-attention and convolution. First, a light-weight convolutional embedding layer is used ahead of the transformer blocks. In comparison to directly flattening raw pixels and applying 1D pre-processing, the convolutional embedding layer encodes precise (i.e., pixel-level) spatial information and provides low-level yet high-resolution 3D features. After the embedding block, transformer and convolutional down-sampling blocks are interleaved to fully entangle long-term dependencies with high-level, hierarchical object concepts at various scales, which helps improve the generalization ability and robustness of the learned representations.
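As a rough illustration of this interleaving pattern, the PyTorch sketch below stacks a convolutional embedding in front of alternating transformer and strided-convolution down-sampling stages. It is not the official nnFormer implementation (which uses volume-based local and global self-attention and a symmetric decoder); the module names and hyper-parameters here (`ConvEmbedding`, `TransformerStage`, `ConvDownsample`, `embed_dim`, `num_stages`) are illustrative assumptions.

```python
import torch
import torch.nn as nn


class ConvEmbedding(nn.Module):
    """Light-weight convolutional embedding: encodes pixel-level spatial
    information into low-level, high-resolution 3D features."""
    def __init__(self, in_channels=1, embed_dim=32):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Conv3d(in_channels, embed_dim, kernel_size=3, padding=1),
            nn.InstanceNorm3d(embed_dim),
            nn.GELU(),
            nn.Conv3d(embed_dim, embed_dim, kernel_size=3, padding=1),
            nn.InstanceNorm3d(embed_dim),
            nn.GELU(),
        )

    def forward(self, x):
        return self.proj(x)


class TransformerStage(nn.Module):
    """Self-attention over the flattened voxels of one resolution stage.
    Plain multi-head attention is used here for simplicity; the actual
    nnFormer applies windowed, volume-based attention instead."""
    def __init__(self, dim, num_heads=4, depth=2):
        super().__init__()
        self.blocks = nn.ModuleList([
            nn.TransformerEncoderLayer(
                d_model=dim, nhead=num_heads, dim_feedforward=dim * 4,
                batch_first=True, norm_first=True)
            for _ in range(depth)
        ])

    def forward(self, x):
        b, c, d, h, w = x.shape
        tokens = x.flatten(2).transpose(1, 2)      # (B, D*H*W, C)
        for blk in self.blocks:
            tokens = blk(tokens)
        return tokens.transpose(1, 2).reshape(b, c, d, h, w)


class ConvDownsample(nn.Module):
    """Strided convolution that halves spatial resolution and doubles channels."""
    def __init__(self, dim):
        super().__init__()
        self.down = nn.Conv3d(dim, dim * 2, kernel_size=2, stride=2)

    def forward(self, x):
        return self.down(x)


class InterleavedEncoder(nn.Module):
    """Convolutional embedding followed by interleaved transformer and
    convolutional down-sampling stages."""
    def __init__(self, in_channels=1, embed_dim=32, num_stages=3):
        super().__init__()
        self.embed = ConvEmbedding(in_channels, embed_dim)
        stages, dim = [], embed_dim
        for _ in range(num_stages):
            stages += [TransformerStage(dim), ConvDownsample(dim)]
            dim *= 2
        self.stages = nn.Sequential(*stages)

    def forward(self, x):
        return self.stages(self.embed(x))


if __name__ == "__main__":
    model = InterleavedEncoder(in_channels=1, embed_dim=32, num_stages=3)
    volume = torch.randn(1, 1, 8, 16, 16)          # (B, C, D, H, W) toy volume
    print(model(volume).shape)                     # torch.Size([1, 256, 1, 2, 2])
```

A practical volumetric model would restrict attention to local windows at the higher-resolution stages, as nnFormer does, to keep the quadratic cost of self-attention over full 3D volumes tractable.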
Source: nnFormer: Interleaved Transformer for Volumetric Segmentation
| Task | Papers | Share |
|---|---|---|
| Image Segmentation | 4 | 25.00% |
| Semantic Segmentation | 4 | 25.00% |
| Medical Image Segmentation | 2 | 12.50% |
| Mamba | 2 | 12.50% |
| 3D Medical Imaging Segmentation | 1 | 6.25% |
| Domain Adaptation | 1 | 6.25% |
| MRI Segmentation | 1 | 6.25% |
| Volumetric Medical Image Segmentation | 1 | 6.25% |