1 code implementation • 13 Mar 2023 • Yusuke Fujita, Tatsuya Komatsu, Robin Scheibler, Yusuke Kida, Tetsuji Ogawa
Experiments on the two-speaker CALLHOME dataset show that the intermediate labels with the proposed non-autoregressive intermediate attractors boost diarization performance.
no code implementations • 25 Oct 2018 • Yusuke Kida, Dung Tran, Motoi Omachi, Toru Taniguchi, Yuya Fujita
The proposed method first uses a DNN-based mask estimator to separate the mixture signal into the keyword signal uttered by the target speaker and the remaining background speech.
Automatic Speech Recognition (ASR) +1
no code implementations • 21 Apr 2021 • Yusuke Kida, Tatsuya Komatsu, Masahito Togami
Speech-to-text alignment is the problem of splitting long audio recordings with unaligned transcripts into utterance-wise pairs of speech and text.
Automatic Speech Recognition (ASR) +2
no code implementations • 1 Apr 2022 • Tatsuya Komatsu, Yusuke Fujita, Jaesong Lee, Lukas Lee, Shinji Watanabe, Yusuke Kida
This paper proposes a method for improved CTC inference with searched intermediates and multi-pass conditioning.
no code implementations • 1 Apr 2022 • Yusuke Fujita, Tatsuya Komatsu, Yusuke Kida
End-to-end automatic speech recognition directly maps input speech to characters.
Automatic Speech Recognition (ASR) +2
no code implementations • 1 Apr 2022 • Yu Nakagome, Tatsuya Komatsu, Yusuke Fujita, Shuta Ichimura, Yusuke Kida
The proposed method exploits the conditioning framework of self-conditioned CTC to train robust models by conditioning with "noisy" intermediate predictions.
no code implementations • 19 Oct 2022 • Takato Yamazaki, Katsumasa Yoshikawa, Toshiki Kawamoto, Masaya Ohagi, Tomoya Mizumoto, Shuta Ichimura, Yusuke Kida, Toshinori Sato
This paper describes our system submitted to the Dialogue Robot Competition 2022.