Search Results for author: Yusuke Kida

Found 7 papers, 1 paper with code

Neural Diarization with Non-autoregressive Intermediate Attractors

1 code implementation • 13 Mar 2023 • Yusuke Fujita, Tatsuya Komatsu, Robin Scheibler, Yusuke Kida, Tetsuji Ogawa

The experiments with the two-speaker CALLHOME dataset show that the intermediate labels with the proposed non-autoregressive intermediate attractors boost the diarization performance.

Speaker Diarization

Speaker Selective Beamformer with Keyword Mask Estimation

no code implementations • 25 Oct 2018 • Yusuke Kida, Dung Tran, Motoi Omachi, Toru Taniguchi, Yuya Fujita

The proposed method first uses a DNN-based mask estimator to separate the mixture signal into the keyword signal uttered by the target speaker and the remaining background speech.

Automatic Speech Recognition (ASR) +1

Label-Synchronous Speech-to-Text Alignment for ASR Using Forward and Backward Transformers

no code implementations • 21 Apr 2021 • Yusuke Kida, Tatsuya Komatsu, Masahito Togami

Speech-to-text alignment is the problem of splitting long audio recordings with unaligned transcripts into utterance-wise pairs of speech and text.

Automatic Speech Recognition (ASR) +2

Better Intermediates Improve CTC Inference

no code implementations • 1 Apr 2022 • Tatsuya Komatsu, Yusuke Fujita, Jaesong Lee, Lukas Lee, Shinji Watanabe, Yusuke Kida

This paper proposes a method for improved CTC inference with searched intermediates and multi-pass conditioning.

InterAug: Augmenting Noisy Intermediate Predictions for CTC-based ASR

no code implementations • 1 Apr 2022 • Yu Nakagome, Tatsuya Komatsu, Yusuke Fujita, Shuta Ichimura, Yusuke Kida

The proposed method exploits the conditioning framework of self-conditioned CTC to train robust models by conditioning with "noisy" intermediate predictions.

Speech Recognition
