nnFormer

Introduced by Zhou et al. in nnFormer: Interleaved Transformer for Volumetric Segmentation

nnFormer, or not-another transFormer, is a semantic segmentation model for volumetric images with an interleaved architecture based on an empirical combination of self-attention and convolution. First, a lightweight convolutional embedding layer is placed ahead of the transformer blocks. Compared with directly flattening raw pixels and applying 1D pre-processing, the convolutional embedding layer encodes precise (i.e., pixel-level) spatial information and provides low-level yet high-resolution 3D features. After the embedding block, transformer blocks and convolutional down-sampling blocks are interleaved to fully entangle long-term dependencies with high-level, hierarchical object concepts at various scales, which helps improve the generalization ability and robustness of the learned representations.
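The interleaving described above can be illustrated by tracing feature-map shapes through an encoder. The following is a minimal sketch in plain Python, not the authors' implementation: the names `FeatureMap`, `conv_embedding`, `transformer_block`, `conv_downsample`, and `trace_encoder` are hypothetical, and the strides (4 for the embedding, 2 for down-sampling), the 96 embedding channels, the channel doubling, and the three stages are assumptions chosen for illustration only.

```python
from dataclasses import dataclass
from math import prod


@dataclass(frozen=True)
class FeatureMap:
    """Shape of a 3D feature map: channel count and (depth, height, width)."""
    channels: int
    spatial: tuple

    @property
    def tokens(self):
        # Number of spatial positions a transformer block would attend over.
        return prod(self.spatial)


def conv_out(size, kernel, stride, padding=0):
    # Standard output-size formula for a convolution along one axis.
    return (size + 2 * padding - kernel) // stride + 1


def conv_embedding(x, out_channels, stride=(4, 4, 4)):
    # Assumption: a strided convolution with kernel == stride and no padding.
    spatial = tuple(conv_out(s, k, k) for s, k in zip(x.spatial, stride))
    return FeatureMap(out_channels, spatial)


def transformer_block(x):
    # Self-attention mixes information across tokens but keeps the shape.
    return x


def conv_downsample(x, stride=(2, 2, 2)):
    # Assumption: halve each spatial axis and double the channel count.
    spatial = tuple(conv_out(s, k, k) for s, k in zip(x.spatial, stride))
    return FeatureMap(x.channels * 2, spatial)


def trace_encoder(input_map, embed_channels=96, num_stages=3):
    """Return (block name, FeatureMap) pairs for an interleaved encoder."""
    x = conv_embedding(input_map, embed_channels)
    trace = [("embedding", x)]
    for stage in range(num_stages):
        x = transformer_block(x)
        trace.append((f"transformer_{stage}", x))
        if stage < num_stages - 1:  # no down-sampling after the last stage
            x = conv_downsample(x)
            trace.append((f"downsample_{stage}", x))
    return trace


if __name__ == "__main__":
    volume = FeatureMap(channels=1, spatial=(64, 128, 128))
    for name, fm in trace_encoder(volume):
        print(f"{name:14s} channels={fm.channels:4d} "
              f"spatial={fm.spatial} tokens={fm.tokens}")
```

Under these assumptions, each transformer stage operates on a coarser grid with more channels than the previous one, which is one way to picture how long-range attention is applied at several scales.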

Source: nnFormer: Interleaved Transformer for Volumetric Segmentation

Tasks

Task	Papers	Share
Image Segmentation	2	33.33%
Semantic Segmentation	2	33.33%
Medical Image Segmentation	1	16.67%
Volumetric Medical Image Segmentation	1	16.67%

Components

Component	Type
Convolution	Convolutions
GELU	Activation Functions
Layer Normalization	Normalization
Multi-Head Attention	Attention Modules
Residual Connection	Skip Connections
Scaled Dot-Product Attention	Attention Mechanisms

Categories

Semantic Segmentation Models

Vision Transformers