1 code implementation • 9 Oct 2023 • Xianming Gu, Lihui Wang, Zeyu Deng, Ying Cao, Xingyu Huang, Yue-Min Zhu
Specifically, we propose the cross-attention fusion (CAF) block, which adaptively fuses features of two modalities in the spatial and frequency domains by exchanging key and query values, and then calculates the cross-attention scores between the spatial and frequency features to further guide the spatial-frequential information fusion.