CCL 2021 • Wan Zhang, Chen Keming, Zhang Yujie, Xu Jinan, Chen Yufeng
“The predominant approach of visual question answering (VQA) relies on encoding the image and question with a “black box” neural encoder and decoding a single token into answers such as “yes” or “no”.