Medical Image Segmentation via Cascaded Attention Decoding
Transformers have shown great promise in medical image segmentation due to their ability to capture long-range dependencies through self-attention. However, they lack the ability to learn the local (contextual) relations among pixels. Previous works try to overcome this problem by embedding convolutional layers in either the encoder or the decoder modules of transformers, which sometimes yields inconsistent features. To address this issue, we propose a novel attention-based decoder, namely CASCaded Attention DEcoder (CASCADE), which leverages the multiscale features of hierarchical vision transformers. CASCADE consists of i) an attention gate, which fuses features with skip connections, and ii) a convolutional attention module, which enhances the long-range and local context by suppressing background information. We use a multi-stage feature and loss aggregation framework because of its faster convergence and better performance. Our experiments demonstrate that transformers with CASCADE significantly outperform state-of-the-art CNN- and transformer-based approaches, obtaining up to 5.07% and 6.16% improvements in DICE and mIoU scores, respectively. CASCADE opens new ways of designing better attention-based decoders.
Task | Dataset | Model | Metric Name | Metric Value | Global Rank
---|---|---|---|---|---
Medical Image Segmentation | Automatic Cardiac Diagnosis Challenge (ACDC) | PVT-CASCADE | Avg DSC | 91.46 | # 5
Medical Image Segmentation | Automatic Cardiac Diagnosis Challenge (ACDC) | TransCASCADE | Avg DSC | 91.63 | # 4
Medical Image Segmentation | CVC-ClinicDB | PVT-CASCADE | mean Dice | 0.9434 | # 10
Medical Image Segmentation | CVC-ClinicDB | PVT-CASCADE | mIoU | 0.8998 | # 6
Medical Image Segmentation | CVC-ColonDB | PVT-CASCADE | mean Dice | 0.8254 | # 4
Medical Image Segmentation | CVC-ColonDB | PVT-CASCADE | mIoU | 0.7453 | # 4
Medical Image Segmentation | ETIS-LARIBPOLYPDB | PVT-CASCADE | mIoU | 0.7258 | # 6
Medical Image Segmentation | ETIS-LARIBPOLYPDB | PVT-CASCADE | mean Dice | 0.8007 | # 4
Medical Image Segmentation | Kvasir-SEG | PVT-CASCADE | mean Dice | 0.9258 | # 13
Medical Image Segmentation | Kvasir-SEG | PVT-CASCADE | mIoU | 0.8776 | # 12
Polyp Segmentation | Kvasir-SEG | PVT-CASCADE | DSC | 0.9258 | # 6
Polyp Segmentation | Kvasir-SEG | PVT-CASCADE | mIoU | 0.8776 | # 2
Medical Image Segmentation | Synapse multi-organ CT | PVT-CASCADE | Avg DSC | 81.06 | # 12
Medical Image Segmentation | Synapse multi-organ CT | PVT-CASCADE | Avg HD | 20.23 | # 8
Medical Image Segmentation | Synapse multi-organ CT | TransCASCADE | Avg DSC | 82.68 | # 8
Medical Image Segmentation | Synapse multi-organ CT | TransCASCADE | Avg HD | 17.34 | # 6
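
For readers unfamiliar with the overlap metrics reported in the table (Dice/DSC and mIoU), the following is a minimal sketch of how they are commonly computed for binary masks. It is not the evaluation code used for these benchmarks: the function names `dice_and_iou` and `mean_scores`, the `eps` smoothing term, the convention that two empty masks score 1.0, and per-image averaging are all assumptions made for illustration. Note also that some rows report scores as fractions (e.g. 0.9434) and others as percentages (e.g. 91.46).

```python
def dice_and_iou(pred, target, eps=1e-8):
    """Return (Dice, IoU) for two binary masks given as nested lists of 0/1.

    Dice = 2|P ∩ T| / (|P| + |T|),  IoU = |P ∩ T| / |P ∪ T|.
    `eps` is an assumed smoothing term so that two empty masks score 1.0.
    """
    p = [v for row in pred for v in row]      # flatten prediction
    t = [v for row in target for v in row]    # flatten ground truth
    if len(p) != len(t):
        raise ValueError("masks must have the same number of pixels")

    inter = sum(1 for a, b in zip(p, t) if a and b)  # |P ∩ T|
    p_sum = sum(1 for a in p if a)                   # |P|
    t_sum = sum(1 for b in t if b)                   # |T|
    union = p_sum + t_sum - inter                    # |P ∪ T|

    dice = (2 * inter + eps) / (p_sum + t_sum + eps)
    iou = (inter + eps) / (union + eps)
    return dice, iou


def mean_scores(pairs):
    """Average Dice and IoU over (pred, target) pairs (assumed per-image mean)."""
    scores = [dice_and_iou(p, t) for p, t in pairs]
    n = len(scores)
    return (sum(d for d, _ in scores) / n, sum(i for _, i in scores) / n)


# Toy example: 2 of 3 predicted pixels overlap a 3-pixel ground truth.
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
dice, iou = dice_and_iou(pred, target)
print(f"Dice={dice:.4f}  IoU={iou:.4f}")
```

For a single mask pair the two metrics are tied by Dice = 2·IoU / (1 + IoU), but that identity does not generally hold between dataset-level means, which is one reason both are usually reported.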