Search Results for author: Kohei Matsuura

Found 15 papers, 2 papers with code

Alignment-Free Training for Transducer-based Multi-Talker ASR

no code implementations30 Sep 2024 Takafumi Moriya, Shota Horiguchi, Marc Delcroix, Ryo Masumura, Takanori Ashihara, Hiroshi Sato, Kohei Matsuura, Masato Mimura

Thus, MT-RNNT-AFT can be trained without relying on accurate alignments, and it can recognize all speakers' speech with just one round of encoder processing.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Sentence-wise Speech Summarization: Task, Datasets, and End-to-End Modeling with LM Knowledge Distillation

no code implementations1 Aug 2024 Kohei Matsuura, Takanori Ashihara, Takafumi Moriya, Masato Mimura, Takatomo Kano, Atsunori Ogawa, Marc Delcroix

Using these datasets, our study evaluates two types of Transformer-based models: 1) cascade models that combine ASR and strong text summarization models, and 2) end-to-end (E2E) models that directly convert speech into a text summary.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +4

What Do Self-Supervised Speech and Speaker Models Learn? New Findings From a Cross Model Layer-Wise Analysis

no code implementations31 Jan 2024 Takanori Ashihara, Marc Delcroix, Takafumi Moriya, Kohei Matsuura, Taichi Asami, Yusuke Ijima

Our analysis unveils that 1) the capacity to represent content information is somewhat unrelated to enhanced speaker representation, 2) specific layers of speech SSL models would be partly specialized in capturing linguistic information, and 3) speaker SSL models tend to disregard linguistic information but exhibit more sophisticated speaker representation.

Self-Supervised Learning

Exploration of Language Dependency for Japanese Self-Supervised Speech Representation Models

no code implementations9 May 2023 Takanori Ashihara, Takafumi Moriya, Kohei Matsuura, Tomohiro Tanaka

However, since the two settings have been studied individually in general, there has been little research focusing on how effective a cross-lingual model is in comparison with a monolingual model.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Leveraging Large Text Corpora for End-to-End Speech Summarization

no code implementations2 Mar 2023 Kohei Matsuura, Takanori Ashihara, Takafumi Moriya, Tomohiro Tanaka, Atsunori Ogawa, Marc Delcroix, Ryo Masumura

The first technique is to utilize a text-to-speech (TTS) system to generate synthesized speech, which is used for E2E SSum training with the text summary.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

Deep versus Wide: An Analysis of Student Architectures for Task-Agnostic Knowledge Distillation of Self-Supervised Speech Models

no code implementations14 Jul 2022 Takanori Ashihara, Takafumi Moriya, Kohei Matsuura, Tomohiro Tanaka

We investigate the performance on SUPERB while varying the structure and KD methods so as to keep the number of parameters constant; this allows us to analyze the contribution of the representation introduced by varying the model architecture.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +4

Cannot find the paper you are looking for? You can Submit a new open access paper.