no code implementations • 26 Mar 2022 • Kohei Saijo, Tetsuji Ogawa
A new learning algorithm for speech separation networks is designed to explicitly reduce residual noise and artifacts in the separated signal in an unsupervised manner.
no code implementations • 20 Oct 2021 • Huaibo Zhao, Yosuke Higuchi, Tetsuji Ogawa, Tetsunori Kobayashi
In the present paper, an attempt is made to combine Mask-CTC and the triggered attention mechanism to construct a streaming end-to-end automatic speech recognition (ASR) system that provides high performance with low latency.
1 code implementation • 8 Oct 2021 • Yosuke Higuchi, Keita Karube, Tetsuji Ogawa, Tetsunori Kobayashi
In this work, to promote word-level representation learning in end-to-end ASR, we propose a hierarchical conditional model based on connectionist temporal classification (CTC).
no code implementations • COLING 2020 • Hikari Tanabe, Tetsuji Ogawa, Tetsunori Kobayashi, Yoshihiko Hayashi
Recognition of the mental state of a human character in text is a major challenge in natural language processing.
no code implementations • 26 Oct 2020 • Yosuke Higuchi, Hirofumi Inaguma, Shinji Watanabe, Tetsuji Ogawa, Tetsunori Kobayashi
While Mask-CTC achieves remarkably fast inference speed, its recognition performance falls behind that of conventional autoregressive (AR) systems.
no code implementations • 18 May 2020 • Yosuke Higuchi, Shinji Watanabe, Nanxin Chen, Tetsuji Ogawa, Tetsunori Kobayashi
In this work, the Mask CTC model is trained using a Transformer encoder-decoder with joint training of mask prediction and CTC.
Audio and Speech Processing • Sound
no code implementations • 21 Jan 2020 • Koki Madono, Masayuki Tanaka, Masaki Onishi, Tetsuji Ogawa
In this study, a perceptually hidden object-recognition method is investigated to generate secure images recognizable by humans but not machines.