In this paper, we propose the Self-Attention Generative Adversarial Network (SAGAN), which allows attention-driven, long-range dependency modeling for image generation tasks. Traditional convolutional GANs generate high-resolution details as a function of only spatially local points in lower-resolution feature maps. In SAGAN, details can be generated using cues from all feature locations.
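To illustrate what "cues from all feature locations" can mean in practice, here is a minimal sketch of self-attention over a flattened feature map. It is not the SAGAN implementation: it assumes plain dot-product attention with no learned projections, and the function names and the residual scale `gamma` are illustrative choices rather than details taken from the text above.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def self_attention(features, gamma=1.0):
    """Mix every location of a flattened feature map with all other locations.

    features: list of N location vectors (an H*W map flattened to N rows),
              each a list of C floats.
    gamma:    hypothetical residual scale; gamma=0 returns the input unchanged.
    """
    outputs = []
    for query in features:
        # Score this location against every location, not only its neighbours.
        weights = softmax([dot(query, key) for key in features])
        # Attention-weighted sum of all feature vectors, channel by channel.
        attended = [
            sum(w * f[c] for w, f in zip(weights, features))
            for c in range(len(query))
        ]
        # Add the attended signal back onto the original feature.
        outputs.append([x + gamma * a for x, a in zip(query, attended)])
    return outputs


if __name__ == "__main__":
    feature_map = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    for row in self_attention(feature_map, gamma=1.0):
        print(row)
```

Unlike a convolution, whose output at one location depends only on a small neighbourhood, each output row here is a weighted combination of every row in the feature map, at a cost that grows quadratically with the number of locations.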
|Task|Dataset|Model|Metric name|Metric value|Global rank|
|---|---|---|---|---|---|
|Conditional Image Generation|ImageNet 128x128|Self-attention|FID|18.65|#2|
|Conditional Image Generation|ImageNet 128x128|Self-attention|Inception score|52.52|#2|