1 code implementation • Findings (EMNLP) 2021 • Kouta Nakayama, Shuhei Kurita, Akio Kobayashi, Yukino Baba, Satoshi Sekine
In this research, we propose a scheme that makes use of all of the systems that participated in the shared tasks.
no code implementations • COLING 2022 • Shuhei Kurita, Hiroki Ouchi, Kentaro Inui, Satoshi Sekine
Semantic Role Labeling (SRL) is the task of labeling semantic arguments for marked semantic predicates.
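To make the task concrete, here is an illustrative sketch (the sentence, labels, and helper are hypothetical, in the spirit of PropBank-style annotation) of what SRL output for one marked predicate looks like:

```python
# Illustrative only: a PropBank-style SRL annotation for one sentence,
# with the marked predicate and its labeled argument spans (token indices).
sentence = ["The", "chef", "sliced", "the", "onions", "thinly"]

srl_annotation = {
    "predicate": 2,  # the marked predicate "sliced"
    "arguments": {
        "ARG0": (0, 1),      # agent:   "The chef"
        "ARG1": (3, 4),      # patient: "the onions"
        "ARGM-MNR": (5, 5),  # manner:  "thinly"
    },
}

def argument_text(tokens, span):
    """Recover the surface string for a labeled argument span."""
    start, end = span
    return " ".join(tokens[start:end + 1])

print(argument_text(sentence, srl_annotation["arguments"]["ARG0"]))  # The chef
```

The SRL system's job is to predict the span-to-label mapping above given the sentence and the marked predicate.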
no code implementations • 28 Feb 2024 • Koki Maeda, Shuhei Kurita, Taiki Miyanishi, Naoaki Okazaki
Given the accelerating progress of vision and language modeling, accurate evaluation of machine-generated image captions remains critical.
no code implementations • 18 Jan 2024 • Hao Wang, Shuhei Kurita, Shuichiro Shimizu, Daisuke Kawahara
Audio-visual speech recognition (AVSR) is a multimodal extension of automatic speech recognition (ASR), using video as a complement to audio.
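The multimodal combination can be sketched as follows. This is a minimal late-fusion illustration under assumed feature sizes, not the paper's architecture: per-frame audio and video features are projected to a shared dimension and summed, which is the shape-level pattern real AVSR encoders build on.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative only: hypothetical per-frame audio features (e.g. filterbanks)
# and video features (e.g. lip-region embeddings) for T aligned frames.
T, D_AUDIO, D_VIDEO, D_MODEL = 50, 80, 512, 256

audio = rng.normal(size=(T, D_AUDIO))
video = rng.normal(size=(T, D_VIDEO))

# Random projections stand in for learned encoder layers.
W_a = rng.normal(size=(D_AUDIO, D_MODEL)) * 0.01
W_v = rng.normal(size=(D_VIDEO, D_MODEL)) * 0.01

# Project each modality to the shared model dimension and fuse by addition;
# a decoder would then predict the transcript from this joint representation.
fused = audio @ W_a + video @ W_v  # shape (T, D_MODEL)
print(fused.shape)
```

In practice the projections are learned jointly with the recognizer, and the video stream compensates for noisy or missing audio.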
1 code implementation • NeurIPS 2023 • Taiki Miyanishi, Fumiya Kitamori, Shuhei Kurita, Jungdae Lee, Motoaki Kawanabe, Nakamasa Inoue
To tackle this problem, we introduce the CityRefer dataset for city-level visual grounding.
1 code implementation • ICCV 2023 • Shuhei Kurita, Naoki Katsura, Eri Onami
In conventional referring expression comprehension tasks for images, however, datasets are mostly constructed from web-crawled data and do not reflect the diverse real-world settings in which textual expressions must be grounded to objects.
1 code implementation • 23 May 2023 • Taiki Miyanishi, Daichi Azuma, Shuhei Kurita, Motoaki Kawanabe
We present a novel task for cross-dataset visual grounding in 3D scenes (Cross3DVG), which overcomes limitations of existing 3D visual grounding models, specifically their restricted 3D resources and consequent tendency to overfit to a specific 3D dataset.
no code implementations • COLING 2022 • Keisuke Shirai, Atsushi Hashimoto, Taichi Nishimura, Hirotaka Kameko, Shuhei Kurita, Yoshitaka Ushiku, Shinsuke Mori
We present a new multimodal dataset called Visual Recipe Flow, which enables us to learn the result of each cooking action in a recipe text.
1 code implementation • CVPR 2022 • Daichi Azuma, Taiki Miyanishi, Shuhei Kurita, Motoaki Kawanabe
We propose a new 3D spatial understanding task of 3D Question Answering (3D-QA).
no code implementations • ICLR 2021 • Shuhei Kurita, Kyunghyun Cho
Vision-and-language navigation (VLN) is a task in which an agent is embodied in a realistic 3D environment and follows an instruction to reach the goal node.
1 code implementation • ACL 2019 • Shuhei Kurita, Anders Søgaard
In Semantic Dependency Parsing (SDP), semantic relations form directed acyclic graphs, rather than trees.
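The DAG-versus-tree distinction can be shown with a tiny sketch (the sentence, edge labels, and helper are hypothetical illustrations, not drawn from any specific SDP formalism): one token may take multiple heads, which a tree forbids, while the graph must still be acyclic.

```python
from collections import defaultdict

# Illustrative only: semantic dependencies for "Sue wants to buy a hat",
# as (head, dependent, relation) triples over token indices.
# Token 0 ("Sue") is an argument of both "wants" (1) and "buy" (3),
# so it has two heads -- a re-entrancy that makes this a DAG, not a tree.
edges = [
    (1, 0, "ARG1"),  # wants -> Sue
    (1, 3, "ARG2"),  # wants -> buy
    (3, 0, "ARG1"),  # buy   -> Sue  (second head for token 0)
    (3, 5, "ARG2"),  # buy   -> hat
    (5, 4, "BV"),    # hat   -> a
]

def is_acyclic(edges, n):
    """Kahn's algorithm: the graph is a DAG iff every node can be ordered."""
    out = defaultdict(list)
    indeg = [0] * n
    for head, dep, _ in edges:
        out[head].append(dep)
        indeg[dep] += 1
    queue = [v for v in range(n) if indeg[v] == 0]
    ordered = 0
    while queue:
        v = queue.pop()
        ordered += 1
        for w in out[v]:
            indeg[w] -= 1
            if indeg[w] == 0:
                queue.append(w)
    return ordered == n

heads_of_sue = sorted(h for h, d, _ in edges if d == 0)
print(heads_of_sue)          # [1, 3] -- two heads, impossible in a tree
print(is_acyclic(edges, 6))  # True   -- but still a DAG
```

A parser for SDP therefore predicts a set of labeled edges directly rather than a single head per token.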
no code implementations • ACL 2018 • Shuhei Kurita, Daisuke Kawahara, Sadao Kurohashi
Japanese predicate-argument structure (PAS) analysis involves zero anaphora resolution, which is notoriously difficult.
no code implementations • ACL 2017 • Shuhei Kurita, Daisuke Kawahara, Sadao Kurohashi
We present neural network-based joint models for Chinese word segmentation, POS tagging and dependency parsing.