1 code implementation • 5 Aug 2021 • Xinzhi Dong, Chengjiang Long, Wenju Xu, Chunxia Xiao
With the well-designed Dual-GCN, the linguistic transformer can better capture the relationships between different objects within a single image and make full use of similar images as auxiliary information, generating a more reasonable caption for that image.
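To make the graph-convolution idea concrete, here is a minimal sketch of one generic GCN layer (the standard symmetric-normalised form), applied once to an object graph and once to an image graph. This is not the authors' implementation: the function names, the toy adjacency matrices, the feature values, and the identity weight matrix are all hypothetical, and the abstract does not specify how the two graphs are built or combined.

```python
import math


def matmul(a, b):
    """Plain matrix product of two lists-of-lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def gcn_layer(adj, feats, weight):
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 H W).

    adj    : n x n non-negative adjacency matrix (hypothetical graph)
    feats  : n x d node feature matrix H
    weight : d x k weight matrix W (learned in a real model, fixed here)
    """
    n = len(adj)
    # Add self-loops so every node keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    # Symmetric degree normalisation; degrees are >= 1 thanks to the self-loops.
    deg = [sum(row) for row in a_hat]
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]
    # Aggregate neighbour features, apply the linear map, then ReLU.
    out = matmul(matmul(norm, feats), weight)
    return [[max(0.0, v) for v in row] for row in out]


identity = [[1.0, 0.0], [0.0, 1.0]]  # stand-in for learned weights

# Assumed object graph: nodes are detected objects within one image.
object_adj = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
object_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
object_out = gcn_layer(object_adj, object_feats, identity)

# Assumed image graph: node 0 is the target image, node 1 a similar image.
image_adj = [[0.0, 1.0], [1.0, 0.0]]
image_feats = [[1.0, 0.0], [0.0, 1.0]]
image_out = gcn_layer(image_adj, image_feats, identity)
```

In this sketch, each output row mixes a node's own features with those of its neighbours, which is the sense in which a graph over objects can encode object relationships and a graph over images can pull in information from similar images. How the two outputs would feed a captioning transformer is left open, since the abstract does not describe it.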