1 code implementation • 9 May 2018 • Zhou Yu, Jun Yu, Chenchao Xiang, Zhou Zhao, Qi Tian, DaCheng Tao
Visual grounding aims to localize an object in an image referred to by a textual query phrase.
Ranked #9 on Phrase Grounding on Flickr30k Entities Test
2 code implementations • 10 Aug 2017 • Zhou Yu, Jun Yu, Chenchao Xiang, Jianping Fan, DaCheng Tao
For fine-grained image and question representations, a `co-attention' mechanism is developed by using a deep neural network architecture to jointly learn the attentions for both the image and the question, which can allow us to reduce the irrelevant features effectively and obtain more discriminative features for image and question representations.