no code implementations • 9 Sep 2024 • Naoyuki Kamo, Naohiro Tawara, Atsushi Ando, Takatomo Kano, Hiroshi Sato, Rintaro Ikeshita, Takafumi Moriya, Shota Horiguchi, Kohei Matsuura, Atsunori Ogawa, Alexis Plaquet, Takanori Ashihara, Tsubasa Ochiai, Masato Mimura, Marc Delcroix, Tomohiro Nakatani, Taichi Asami, Shoko Araki
We present a distant automatic speech recognition (DASR) system developed for the CHiME-8 DASR track.
no code implementations • 1 Aug 2024 • Kohei Matsuura, Takanori Ashihara, Takafumi Moriya, Masato Mimura, Takatomo Kano, Atsunori Ogawa, Marc Delcroix
Using these datasets, our study evaluates two types of Transformer-based models: 1) cascade models that combine ASR and strong text summarization models, and 2) end-to-end (E2E) models that directly convert speech into a text summary.
Automatic Speech Recognition (ASR) +4
no code implementations • 27 Jun 2024 • Atsunori Ogawa, Naoyuki Kamo, Kohei Matsuura, Takanori Ashihara, Takafumi Moriya, Takatomo Kano, Naohiro Tawara, Marc Delcroix
We investigate the effects of domain adaptation of the LLM and context carry-over when performing N-best rescoring.
Automatic Speech Recognition (ASR) +2
no code implementations • 22 Dec 2023 • Atsunori Ogawa, Naohiro Tawara, Takatomo Kano, Marc Delcroix
Confidence estimation, in which we estimate the reliability of each recognized token (e.g., word, sub-word, or character) in automatic speech recognition (ASR) hypotheses and detect incorrectly recognized tokens, is an important function for developing ASR applications.
Automatic Speech Recognition (ASR) +1
no code implementations • 20 Dec 2023 • Atsunori Ogawa, Naohiro Tawara, Marc Delcroix, Shoko Araki
We investigate the effectiveness of using a large ensemble of advanced neural language models (NLMs) for lattice rescoring on automatic speech recognition (ASR) hypotheses.
Automatic Speech Recognition (ASR) +1
no code implementations • 17 Oct 2023 • Atsunori Ogawa, Takafumi Moriya, Naoyuki Kamo, Naohiro Tawara, Marc Delcroix
In experiments using an attention-based encoder-decoder ASR system, we confirmed that iterative shallow fusion (ISF) using the partial sentence-aware backward language model (PBLM) performs comparably to shallow fusion (SF) using the forward language model (FLM).
Automatic Speech Recognition (ASR) +4
no code implementations • 22 Sep 2023 • Naohiro Tawara, Marc Delcroix, Atsushi Ando, Atsunori Ogawa
This paper details our speaker diarization system designed for multi-domain, multi-microphone casual conversations.
no code implementations • 7 Jun 2023 • Kohei Matsuura, Takanori Ashihara, Takafumi Moriya, Tomohiro Tanaka, Takatomo Kano, Atsunori Ogawa, Marc Delcroix
End-to-end speech summarization (E2E SSum) directly summarizes input speech into easy-to-read short sentences with a single model.
no code implementations • 25 May 2023 • Takafumi Moriya, Hiroshi Sato, Tsubasa Ochiai, Marc Delcroix, Takanori Ashihara, Kohei Matsuura, Tomohiro Tanaka, Ryo Masumura, Atsunori Ogawa, Taichi Asami
Neural transducer (RNNT)-based target-speaker speech recognition (TS-RNNT) directly transcribes a target speaker's voice from a multi-talker mixture.
no code implementations • 23 May 2023 • Marc Delcroix, Naohiro Tawara, Mireia Diez, Federico Landini, Anna Silnova, Atsunori Ogawa, Tomohiro Nakatani, Lukas Burget, Shoko Araki
Combining end-to-end neural speaker diarization (EEND) with vector clustering (VC), known as EEND-VC, has gained interest for leveraging the strengths of both methods.
no code implementations • 2 Mar 2023 • Kohei Matsuura, Takanori Ashihara, Takafumi Moriya, Tomohiro Tanaka, Atsunori Ogawa, Marc Delcroix, Ryo Masumura
The first technique is to utilize a text-to-speech (TTS) system to generate synthesized speech, which is used for E2E SSum training with the text summary.
Automatic Speech Recognition (ASR) +3
2 code implementations • 16 Nov 2021 • Takatomo Kano, Atsunori Ogawa, Marc Delcroix, Shinji Watanabe
We propose a cascade speech summarization model that is robust to ASR errors and that exploits multiple hypotheses generated by ASR to attenuate the effect of ASR errors on the summary.
Automatic Speech Recognition (ASR) +2
no code implementations • 17 Apr 2021 • Ayako Yamamoto, Toshio Irino, Kenichi Arai, Shoko Araki, Atsunori Ogawa, Keisuke Kinoshita, Tomohiro Nakatani
Many subjective experiments have been performed to develop objective speech intelligibility measures, but the novel coronavirus outbreak has made it very difficult to conduct experiments in a laboratory.