1 code implementation • 17 Jun 2022 • Yifeng Zhuang, Qiang Sun, Yanwei Fu, Lifeng Chen, Xiangyang Xue
The attention mechanism in the transformer architecture can better integrate inter- and intra-modal information of vision and language.
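As a minimal NumPy sketch (not the paper's code), concatenating vision and language tokens and running one self-attention pass over them lets every token attend to every other, mixing intra-modal and inter-modal information in a single step; the token counts and embedding size here are illustrative assumptions.

```python
import numpy as np

def scaled_dot_product_attention(q, k, v):
    """Standard scaled dot-product attention with a numerically stable softmax."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

# Toy tokens: 3 vision patches and 2 language tokens, embedding dim 4 (illustrative sizes).
rng = np.random.default_rng(0)
vision = rng.normal(size=(3, 4))
language = rng.normal(size=(2, 4))

# Concatenating both modalities means the attention weights cover
# vision-vision and language-language (intra-modal) as well as
# vision-language (inter-modal) pairs in one pass.
tokens = np.concatenate([vision, language], axis=0)
out, weights = scaled_dot_product_attention(tokens, tokens, tokens)
print(out.shape)                                # (5, 4)
print(np.allclose(weights.sum(axis=-1), 1.0))   # True
```

Each row of `weights` is a probability distribution over all five tokens, so every output embedding is a mixture of both modalities.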
Navigate • Vision and Language Navigation