Search Results for author: Rintaro Ikeshita

Found 11 papers, 0 papers with code

Rethinking Processing Distortions: Disentangling the Impact of Speech Enhancement Errors on Speech Recognition Performance

no code implementations • 23 Apr 2024 • Tsubasa Ochiai, Kazuma Iwamoto, Marc Delcroix, Rintaro Ikeshita, Hiroshi Sato, Shoko Araki, Shigeru Katagiri

To this end, we propose a novel analysis scheme based on the orthogonal projection-based decomposition of SE errors.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Paper
Add Code

How does end-to-end speech recognition training impact speech enhancement artifacts?

no code implementations • 20 Nov 2023 • Kazuma Iwamoto, Tsubasa Ochiai, Marc Delcroix, Rintaro Ikeshita, Hiroshi Sato, Shoko Araki, Shigeru Katagiri

Jointly training a speech enhancement (SE) front-end and an automatic speech recognition (ASR) back-end has been investigated as a way to mitigate the influence of \emph{processing distortion} generated by single-channel SE on ASR.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Paper
Add Code

Neural network-based virtual microphone estimation with virtual microphone and beamformer-level multi-task loss

no code implementations • 20 Nov 2023 • Hanako Segawa, Tsubasa Ochiai, Marc Delcroix, Tomohiro Nakatani, Rintaro Ikeshita, Shoko Araki, Takeshi Yamada, Shoji Makino

However, this training objective may not be optimal for a specific array processing back-end, such as beamforming.

Paper
Add Code

ISS2: An Extension of Iterative Source Steering Algorithm for Majorization-Minimization-Based Independent Vector Analysis

no code implementations • 2 Feb 2022 • Rintaro Ikeshita, Tomohiro Nakatani

Although the time complexity per iteration of ISS is $m$ times smaller than that of IP, the conventional ISS converges slower than the current fastest IP (called $\text{IP}_2$) that updates two rows of $W$ in each iteration.

Paper
Add Code

How Bad Are Artifacts?: Analyzing the Impact of Speech Enhancement Errors on ASR

no code implementations • 18 Jan 2022 • Kazuma Iwamoto, Tsubasa Ochiai, Marc Delcroix, Rintaro Ikeshita, Hiroshi Sato, Shoko Araki, Shigeru Katagiri

The artifact component is defined as the SE error signal that cannot be represented as a linear combination of speech and noise sources.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Paper
Add Code

Switching Independent Vector Analysis and Its Extension to Blind and Spatially Guided Convolutional Beamforming Algorithms

no code implementations • 20 Nov 2021 • Tomohiro Nakatani, Rintaro Ikeshita, Keisuke Kinoshita, Hiroshi Sawada, Naoyuki Kamo, Shoko Araki

This paper develops a framework that can perform denoising, dereverberation, and source separation accurately by using a relatively small number of microphones.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

Paper
Add Code

Blind and neural network-guided convolutional beamformer for joint denoising, dereverberation, and source separation

no code implementations • 4 Aug 2021 • Tomohiro Nakatani, Rintaro Ikeshita, Keisuke Kinoshita, Hiroshi Sawada, Shoko Araki

This paper proposes an approach for optimizing a Convolutional BeamFormer (CBF) that can jointly perform denoising (DN), dereverberation (DR), and source separation (SS).

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Paper
Add Code

Independent Vector Extraction for Fast Joint Blind Source Separation and Dereverberation

no code implementations • 9 Feb 2021 • Rintaro Ikeshita, Tomohiro Nakatani

We address a blind source separation (BSS) problem in a noisy reverberant environment in which the number of microphones $M$ is greater than the number of sources of interest, and the other noise components can be approximated as stationary and Gaussian distributed.

blind source separation

Paper
Add Code

A Joint Diagonalization Based Efficient Approach to Underdetermined Blind Audio Source Separation Using the Multichannel Wiener Filter

no code implementations • 21 Jan 2021 • Nobutaka Ito, Rintaro Ikeshita, Hiroshi Sawada, Tomohiro Nakatani

Based on this approach, we present FastFCA, a computationally efficient extension of FCA.

Audio Source Separation Sound Audio and Speech Processing

Paper
Add Code

Neural Network-based Virtual Microphone Estimator

no code implementations • 12 Jan 2021 • Tsubasa Ochiai, Marc Delcroix, Tomohiro Nakatani, Rintaro Ikeshita, Keisuke Kinoshita, Shoko Araki

Developing microphone array technologies for a small number of microphones is important due to the constraints of many devices.

Speech Enhancement

Paper
Add Code

Block Coordinate Descent Algorithms for Auxiliary-Function-Based Independent Vector Extraction

no code implementations • 18 Oct 2020 • Rintaro Ikeshita, Tomohiro Nakatani, Shoko Araki

We also newly develop a BCD for a semiblind IVE in which the transfer functions for several super-Gaussian sources are given a priori.

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.