2 code implementations • CVPR 2017 • Hyeonseob Nam, Jung-Woo Ha, Jeonghee Kim
We propose Dual Attention Networks (DANs) which jointly leverage visual and textual attention mechanisms to capture fine-grained interplay between vision and language.
Ranked #2 on Visual Question Answering (VQA) on VQA v1 test-dev
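The dual attention idea above (attending to image regions and question words jointly, then fusing the two contexts) can be sketched in a few lines. This is a minimal illustration, not the paper's architecture: all weights, dimensions, the `tanh` scoring function, and the elementwise memory update are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attend(feats, memory, Wf, Wm, w):
    """Soft attention: score each feature vector against a shared memory vector,
    then return the attention-weighted context."""
    h = np.tanh(feats @ Wf + memory @ Wm)   # (N, k) joint scores
    alpha = softmax(h @ w)                  # (N,) attention weights
    return alpha @ feats                    # (d,) weighted context

d, k = 8, 16
Wf = rng.standard_normal((d, k))
Wm = rng.standard_normal((d, k))
w  = rng.standard_normal(k)

visual  = rng.standard_normal((10, d))   # e.g. 10 image-region features
textual = rng.standard_normal((6, d))    # e.g. 6 question-word features
memory  = np.zeros(d)                    # shared memory guiding both attentions

# Two reasoning steps: attend to each modality, then update the shared memory
# with a joint (here simply elementwise) combination of the two contexts.
for _ in range(2):
    v_ctx = attend(visual, memory, Wf, Wm, w)
    t_ctx = attend(textual, memory, Wf, Wm, w)
    memory = memory + v_ctx * t_ctx
```

The key point the sketch captures is that both attention maps are conditioned on the same evolving memory, so visual and textual attention can steer each other across steps.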
8 code implementations • 14 Oct 2016 • Jin-Hwa Kim, Kyoung-Woon On, Woosang Lim, Jeonghee Kim, Jung-Woo Ha, Byoung-Tak Zhang
Bilinear models provide rich representations compared with linear models.
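A full bilinear interaction between two feature vectors needs one dense matrix per output dimension, which is expensive; this paper's low-rank factorization reduces it to a Hadamard (elementwise) product of two linear projections. A minimal NumPy sketch (dimensions and weights are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def low_rank_bilinear(x, y, U, V, P):
    """Low-rank bilinear pooling.

    Full bilinear: z_k = x^T W_k y, with one (dx, dy) matrix W_k per output.
    Factorizing W_k ~= U diag(P[k]) V^T collapses this to an elementwise
    (Hadamard) product of the two projected inputs, followed by a linear map.
    """
    return P @ ((U.T @ x) * (V.T @ y))   # (dz,)

dx, dy, r, dz = 5, 7, 4, 3               # input dims, rank, output dim
U = rng.standard_normal((dx, r))
V = rng.standard_normal((dy, r))
P = rng.standard_normal((dz, r))
x = rng.standard_normal(dx)              # e.g. question embedding
y = rng.standard_normal(dy)              # e.g. visual feature

z = low_rank_bilinear(x, y, U, V, P)     # joint multimodal representation
```

With rank r, the parameter count drops from dz·dx·dy to r·(dx + dy + dz), while each output is still an exact bilinear form in x and y.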
1 code implementation • NeurIPS 2016 • Jin-Hwa Kim, Sang-Woo Lee, Dong-Hyun Kwak, Min-Oh Heo, Jeonghee Kim, Jung-Woo Ha, Byoung-Tak Zhang
We present Multimodal Residual Networks (MRN) for the multimodal residual learning of visual question-answering, which extends the idea of deep residual learning.
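The multimodal residual idea can be sketched as a stack of blocks in which the question state takes an identity shortcut and the learned residual is a joint function of the question and the image. This is a simplified illustration under assumed dimensions and weights, not the paper's exact block:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16

def mrn_block(q, v, Wq, Wv1, Wv2):
    """One multimodal residual block (sketch): identity shortcut for the
    question state q, plus a residual formed by an elementwise joint of
    nonlinear projections of q and the visual feature v."""
    F = np.tanh(Wq @ q) * np.tanh(Wv2 @ np.tanh(Wv1 @ v))
    return q + F   # shortcut + multimodal residual

Wq, Wv1, Wv2 = (rng.standard_normal((d, d)) * 0.1 for _ in range(3))

q = rng.standard_normal(d)   # question embedding
v = rng.standard_normal(d)   # visual feature
for _ in range(3):           # stack a few blocks (weights shared here for brevity)
    q = mrn_block(q, v, Wq, Wv1, Wv2)
```

As in deep residual learning, the shortcut lets each block learn only a joint correction to the running question representation, which keeps deep stacks trainable.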
no code implementations • 15 Jun 2015 • Sang-Woo Lee, Min-Oh Heo, Jiwon Kim, Jeonghee Kim, Byoung-Tak Zhang
The proposed architecture consists of deep representation learners and fast-learnable shallow kernel networks, which work together to track information from incoming data.