no code implementations • 18 Jan 2024 • Kohei Uehara, Nabarun Goswami, Hanqin Wang, Toshiaki Baba, Kohtaro Tanaka, Tomohiro Hashimoto, Kai Wang, Rei Ito, Takagi Naoya, Ryo Umagami, Yingyi Wen, Tanachai Anakewat, Tatsuya Harada
The increasing demand for intelligent systems capable of interpreting and reasoning about visual content requires the development of Large Multi-Modal Models (LMMs) that are not only accurate but also possess explicit reasoning capabilities.
no code implementations • 12 Oct 2022 • Kohei Uehara, Tatsuya Harada
Our pipeline consists of two components: the Object Classifier, which performs knowledge-based object recognition, and the Question Generator, which generates knowledge-aware questions to acquire novel knowledge.
no code implementations • 15 Mar 2022 • Kohei Uehara, Tatsuya Harada
Visual Question Generation (VQG) is the task of generating questions from images.
no code implementations • 15 Feb 2022 • Kohei Uehara, Yusuke Mori, Yusuke Mukuta, Tatsuya Harada
Image narrative generation is the task of creating a story from an image from a subjective viewpoint.
no code implementations • EMNLP (nlpbt) 2020 • Kohei Uehara, Tatsuya Harada
In the majority of the existing Visual Question Answering (VQA) research, the answers consist of short, often single words, as per instructions given to the annotators during dataset construction.
no code implementations • 7 May 2019 • Sho Maeoki, Kohei Uehara, Tatsuya Harada
We propose a system that retrieves videos by asking questions about their content and leveraging the user's responses to the questions.
1 code implementation • ECCV 2018 • Kohei Uehara, Antonio Tejero-de-Pablos, Yoshitaka Ushiku, Tatsuya Harada
In this paper, we propose a method for generating questions about unknown objects in an image as a means of obtaining information about classes that have not yet been learned.