no code implementations • WS 2017 • Kentaro Kanada, Tetsunori Kobayashi, Yoshihiko Hayashi
This paper proposes a method for classifying the type of lexical-semantic relation between a given pair of words.
no code implementations • COLING 2018 • Mao Nakanishi, Tetsunori Kobayashi, Yoshihiko Hayashi
However, to realize human-like language comprehension, a machine should also be able to distinguish not-answerable questions (NAQs) from answerable ones.
no code implementations • WS 2019 • Mao Nakanishi, Tetsunori Kobayashi, Yoshihiko Hayashi
Conversational question generation is a novel area of NLP research with a range of potential applications.
no code implementations • LREC 2020 • Mika Hasegawa, Tetsunori Kobayashi, Yoshihiko Hayashi
Human semantic knowledge about concepts acquired through perceptual inputs and daily experiences can be expressed as a bundle of attributes.
no code implementations • 18 May 2020 • Yosuke Higuchi, Shinji Watanabe, Nanxin Chen, Tetsuji Ogawa, Tetsunori Kobayashi
In this work, the Mask CTC model is trained with a Transformer encoder-decoder, jointly optimizing the mask prediction and CTC objectives.
Audio and Speech Processing · Sound
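The CTC objective mentioned in this entry can be illustrated with a minimal pure-Python sketch of the standard CTC forward (alpha) recursion, which sums the probability of every frame-level alignment that collapses to the target sequence. This is a toy illustration with made-up inputs, not the authors' implementation:

```python
import math

def ctc_log_likelihood(log_probs, target, blank=0):
    """Forward (alpha) recursion for CTC.

    log_probs: T x V list of per-frame log-probabilities.
    target: non-empty label sequence without blanks, e.g. [1, 2].
    Returns log P(target | log_probs), summed over all alignments.
    """
    # Extended target with blanks interleaved: [b, y1, b, y2, b, ...]
    ext = [blank]
    for y in target:
        ext += [y, blank]
    S, T = len(ext), len(log_probs)

    NEG_INF = float("-inf")

    def logsumexp(*xs):
        m = max(xs)
        if m == NEG_INF:
            return NEG_INF
        return m + math.log(sum(math.exp(x - m) for x in xs))

    # alpha[s]: log-prob of reaching state s of ext at the current frame
    alpha = [NEG_INF] * S
    alpha[0] = log_probs[0][ext[0]]
    alpha[1] = log_probs[0][ext[1]]
    for t in range(1, T):
        new = [NEG_INF] * S
        for s in range(S):
            terms = [alpha[s]]
            if s > 0:
                terms.append(alpha[s - 1])
            # Skip transition allowed between distinct non-blank labels
            if s > 1 and ext[s] != blank and ext[s] != ext[s - 2]:
                terms.append(alpha[s - 2])
            new[s] = logsumexp(*terms) + log_probs[t][ext[s]]
        alpha = new
    # Valid paths end in the last label or the trailing blank
    return logsumexp(alpha[-1], alpha[-2])
```

With uniform per-frame probabilities over 3 classes, two frames, and target `[1]`, the three collapsing paths (1,1), (1,blank), (blank,1) each have probability 1/9, so the total likelihood is 1/3.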
no code implementations • 26 Oct 2020 • Yosuke Higuchi, Hirofumi Inaguma, Shinji Watanabe, Tetsuji Ogawa, Tetsunori Kobayashi
While Mask-CTC achieves remarkably fast inference speed, its recognition performance falls behind that of conventional autoregressive (AR) systems.
Automatic Speech Recognition (ASR) +2
no code implementations • COLING 2020 • Hiroaki Takatsu, Ryota Ando, Yoichi Matsuyama, Tetsunori Kobayashi
As smart speakers and conversational robots become ubiquitous, the demand for expressive speech synthesis has increased.
no code implementations • COLING 2020 • Hikari Tanabe, Tetsuji Ogawa, Tetsunori Kobayashi, Yoshihiko Hayashi
Recognition of the mental state of a human character in text is a major challenge in natural language processing.
1 code implementation • 8 Oct 2021 • Yosuke Higuchi, Keita Karube, Tetsuji Ogawa, Tetsunori Kobayashi
In this work, to promote the word-level representation learning in end-to-end ASR, we propose a hierarchical conditional model that is based on connectionist temporal classification (CTC).
Automatic Speech Recognition (ASR) +2
no code implementations • 20 Oct 2021 • Huaibo Zhao, Yosuke Higuchi, Tetsuji Ogawa, Tetsunori Kobayashi
This paper combines Mask-CTC and the triggered attention mechanism to construct a streaming end-to-end automatic speech recognition (ASR) system that achieves high performance with low latency.
Automatic Speech Recognition (ASR) +1
no code implementations • 29 Oct 2022 • Yosuke Higuchi, Brian Yan, Siddhant Arora, Tetsuji Ogawa, Tetsunori Kobayashi, Shinji Watanabe
This paper presents BERT-CTC, a novel formulation of end-to-end speech recognition that adapts BERT for connectionist temporal classification (CTC).
1 code implementation • 2 Nov 2022 • Yosuke Higuchi, Tetsuji Ogawa, Tetsunori Kobayashi, Shinji Watanabe
This paper presents InterMPL, a semi-supervised learning method of end-to-end automatic speech recognition (ASR) that performs pseudo-labeling (PL) with intermediate supervision.
Automatic Speech Recognition (ASR) +1
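The two ingredients named in this entry, pseudo-labeling and intermediate supervision, can be sketched generically. Below is a confidence-filtered pseudo-label selector and the usual interpolated intermediate-CTC loss; the function names, data shapes, and threshold are illustrative assumptions, not the paper's implementation:

```python
def select_pseudo_labels(hypotheses, threshold=0.9):
    """Generic confidence-filtered pseudo-labeling (illustrative only).

    hypotheses: list of (utterance_id, transcript, confidence) tuples
    produced by decoding a seed model on unlabeled audio.
    Returns the subset confident enough to serve as training targets.
    """
    return [(uid, text) for uid, text, conf in hypotheses if conf >= threshold]

def interctc_loss(final_loss, inter_losses, w=0.5):
    """Interpolate the final CTC loss with the average of the
    intermediate-layer CTC losses (weight w on the intermediates)."""
    return (1 - w) * final_loss + w * sum(inter_losses) / len(inter_losses)
```

For example, `select_pseudo_labels([("utt1", "hello world", 0.95), ("utt2", "noisy guess", 0.4)])` keeps only the first hypothesis, and `interctc_loss(2.0, [1.0, 3.0])` averages the intermediate losses before mixing them with the final one.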
no code implementations • 2 Nov 2022 • Yosuke Higuchi, Tetsuji Ogawa, Tetsunori Kobayashi, Shinji Watanabe
One crucial factor that makes this integration challenging is the vocabulary mismatch: the vocabulary constructed for a pre-trained LM is generally too large for E2E-ASR training and is likely to mismatch the target ASR domain.
Automatic Speech Recognition (ASR) +2
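One simple way to see the vocabulary mismatch described above is to measure how much of an ASR system's token vocabulary a pre-trained LM's vocabulary covers. A minimal sketch with a hypothetical helper (not from the paper):

```python
def vocab_overlap(lm_vocab, asr_vocab):
    """Fraction of the ASR vocabulary also present in the LM vocabulary.

    A low value signals the mismatch problem: many ASR tokens have no
    counterpart in the pre-trained LM's (typically much larger) vocabulary.
    """
    lm = set(lm_vocab)
    hits = sum(1 for tok in asr_vocab if tok in lm)
    return hits / len(asr_vocab)
```

For instance, `vocab_overlap(["a", "b", "c"], ["a", "x"])` reports that only half of the ASR tokens are covered.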
no code implementations • 19 Sep 2023 • Yosuke Higuchi, Tetsuji Ogawa, Tetsunori Kobayashi
We present a novel integration of an instruction-tuned large language model (LLM) and end-to-end automatic speech recognition (ASR).
Automatic Speech Recognition (ASR) +5
no code implementations • 12 Oct 2023 • Kohei Saijo, Wangyou Zhang, Zhong-Qiu Wang, Shinji Watanabe, Tetsunori Kobayashi, Tetsuji Ogawa
We propose a multi-task universal speech enhancement (MUSE) model that can perform five speech enhancement (SE) tasks: dereverberation, denoising, speech separation (SS), target speaker extraction (TSE), and speaker counting.
no code implementations • COLING 2022 • Masato Takatsuka, Tetsunori Kobayashi, Yoshihiko Hayashi
Although the fluency of automatically generated abstractive summaries has improved significantly with advanced methods, the inconsistencies that remain in generated summaries are a recognized issue to be addressed.