Search Results for author: Shuhei Kurita

Found 13 papers, 6 papers with code

Vision Language Model-based Caption Evaluation Method Leveraging Visual Context Extraction

no code implementations 28 Feb 2024 Koki Maeda, Shuhei Kurita, Taiki Miyanishi, Naoaki Okazaki

Given the accelerating progress of vision and language modeling, accurate evaluation of machine-generated image captions remains critical.

Image Captioning · Language Modelling

SlideAVSR: A Dataset of Paper Explanation Videos for Audio-Visual Speech Recognition

no code implementations 18 Jan 2024 Hao Wang, Shuhei Kurita, Shuichiro Shimizu, Daisuke Kawahara

Audio-visual speech recognition (AVSR) is a multimodal extension of automatic speech recognition (ASR), using video as a complement to audio.

Audio-Visual Speech Recognition · Automatic Speech Recognition +4

RefEgo: Referring Expression Comprehension Dataset from First-Person Perception of Ego4D

1 code implementation ICCV 2023 Shuhei Kurita, Naoki Katsura, Eri Onami

In conventional referring expression comprehension tasks on images, however, datasets are mostly built from web-crawled data and therefore fail to reflect the diverse real-world situations encountered when grounding textual expressions in objects in the real world.

Object · Object Tracking +2

Cross3DVG: Cross-Dataset 3D Visual Grounding on Different RGB-D Scans

1 code implementation 23 May 2023 Taiki Miyanishi, Daichi Azuma, Shuhei Kurita, Motoki Kawanabe

We present a novel task for cross-dataset visual grounding in 3D scenes (Cross3DVG), which overcomes limitations of existing 3D visual grounding models, specifically their limited 3D resources and the consequent tendency to overfit to a specific 3D dataset.

3D Reconstruction · Visual Grounding

Generative Language-Grounded Policy in Vision-and-Language Navigation with Bayes' Rule

no code implementations ICLR 2021 Shuhei Kurita, Kyunghyun Cho

Vision-and-language navigation (VLN) is a task in which an agent is embodied in a realistic 3D environment and follows an instruction to reach the goal node.

Language Modelling · Vision and Language Navigation
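
The title indicates that the policy is obtained by applying Bayes' rule to a generative language model, i.e. scoring each candidate action by how well it explains the instruction, weighted by an action prior. Below is a minimal sketch of that idea only, not the paper's actual implementation; the scoring functions, action names, and signatures are hypothetical placeholders.

```python
# Hedged sketch: Bayes'-rule action selection,
# pi(a | instruction, state) ∝ p(instruction | state, a) * p(a | state).
# `instruction_log_likelihood` and `action_log_prior` stand in for a
# generative language model and an action prior (both hypothetical here).
import math
from typing import Callable, Sequence

def select_action(
    instruction: str,
    state: str,
    actions: Sequence[str],
    instruction_log_likelihood: Callable[[str, str, str], float],
    action_log_prior: Callable[[str, str], float],
) -> str:
    # Score each candidate action by instruction likelihood plus action prior
    # (log space), then pick the highest-scoring action.
    scores = [
        instruction_log_likelihood(instruction, state, a) + action_log_prior(state, a)
        for a in actions
    ]
    return actions[max(range(len(actions)), key=scores.__getitem__)]

# Toy usage with dummy models, for illustration only.
if __name__ == "__main__":
    dummy_ll = lambda instr, s, a: -float(len(a))   # pretend LM log-likelihood
    dummy_prior = lambda s, a: math.log(1 / 3)      # uniform prior over 3 actions
    print(select_action("go to the kitchen", "hallway",
                        ["forward", "left", "right"], dummy_ll, dummy_prior))
```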

Neural Adversarial Training for Semi-supervised Japanese Predicate-argument Structure Analysis

no code implementations ACL 2018 Shuhei Kurita, Daisuke Kawahara, Sadao Kurohashi

Japanese predicate-argument structure (PAS) analysis involves zero anaphora resolution, which is notoriously difficult.
