Search Results for author: Mihiro Uchida

Found 2 papers, 0 papers with code

End-to-End Joint Target and Non-Target Speakers ASR

no code implementations • 4 Jun 2023 • Ryo Masumura, Naoki Makishima, Taiga Yamane, Yoshihiko Yamazaki, Saki Mizuno, Mana Ihori, Mihiro Uchida, Keita Suzuki, Hiroshi Sato, Tomohiro Tanaka, Akihiko Takashima, Satoshi Suzuki, Takafumi Moriya, Nobukatsu Hojo, Atsushi Ando

Target-speaker ASR systems are a promising way to transcribe only a target speaker's speech by enrolling that speaker's information.

Automatic Speech Recognition • Automatic Speech Recognition (ASR) • +1

Audio Visual Scene-Aware Dialog Generation with Transformer-based Video Representations

no code implementations • 21 Feb 2022 • Yoshihiro Yamazaki, Shota Orihashi, Ryo Masumura, Mihiro Uchida, Akihiko Takashima

There have been many attempts to build multimodal dialog systems that can respond to a question about given audio-visual information, and the representative task for such systems is the Audio Visual Scene-Aware Dialog (AVSD).

Answer Generation • Video Understanding
