no code implementations • 1 Nov 2022 • Hongyang He, Feng Ziliang, Yuanhang Zheng, Shudong Huang, HaoBing Gao
In the field of medical CT image processing, convolutional neural networks (CNNs) have been the dominant technique. Encoder-decoder CNNs exploit locality for efficiency, but they cannot properly model interactions between distant pixels. Recent research indicates that stacking self-attention or transformer layers can efficiently learn such long-range dependencies. Transformers have been applied to computer vision tasks by splitting an image into patches and processing those patches as embeddings.
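The patch-embedding idea mentioned above can be sketched minimally as follows. This is an illustrative NumPy example of ViT-style patch embedding, not the paper's implementation: the image size, patch size, embedding dimension, and the random projection matrix (standing in for a learned linear layer) are all assumptions.

```python
import numpy as np

def patch_embed(image, patch_size, embed_dim, rng=None):
    """Split a square image into non-overlapping patches and linearly
    project each flattened patch to embed_dim. The projection matrix is
    random here; in a real transformer it would be a learned layer."""
    rng = np.random.default_rng(0) if rng is None else rng
    h, w, c = image.shape
    p = patch_size
    assert h % p == 0 and w % p == 0, "image must divide evenly into patches"
    # Rearrange (h, w, c) -> (num_patches, p*p*c): one row per flattened patch
    patches = (image
               .reshape(h // p, p, w // p, p, c)
               .transpose(0, 2, 1, 3, 4)
               .reshape(-1, p * p * c))
    proj = rng.normal(scale=0.02, size=(p * p * c, embed_dim))
    return patches @ proj  # shape: (num_patches, embed_dim)

# Hypothetical input: a 32x32 RGB image split into 8x8 patches
img = np.zeros((32, 32, 3))
tokens = patch_embed(img, patch_size=8, embed_dim=64)
print(tokens.shape)  # (16, 64): 16 patch tokens, each a 64-dim embedding
```

Each row of the resulting token matrix is then treated like a word embedding in NLP, so a stack of self-attention layers can relate any patch to any other regardless of spatial distance.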