Textually Enriched Neural Module Networks for Visual Question Answering

23 Sep 2018 · Khyathi Raghavi Chandu, Mary Arpita Pyreddy, Matthieu Felix, Narendra Nath Joshi

Problems at the intersection of language and vision, such as visual question answering, have recently been gaining a lot of attention in the field of multi-modal machine learning as computer vision research moves beyond traditional recognition tasks. There has been recent success in visual question answering using deep neural network models that use the linguistic structure of the questions to dynamically instantiate network layouts...
