Search Results for author: Kohei Yatabe

Found 20 papers, 3 papers with code

Sampling-Frequency-Independent Universal Sound Separation

no code implementations • 22 Sep 2023 • Tomohiko Nakamura, Kohei Yatabe

Universal sound separation (USS) aims to separate arbitrary sources of different types and can be a key technique for realizing a source separator that can be used universally as a preprocessor for any downstream task.

Versatile Time-Frequency Representations Realized by Convex Penalty on Magnitude Spectrogram

no code implementations • 3 Aug 2023 • Keidai Arai, Koki Yamada, Kohei Yatabe

Sparse time-frequency (T-F) representations have been an important research topic for several decades.

LibriTTS-R: A Restored Multi-Speaker Text-to-Speech Corpus

no code implementations • 30 May 2023 • Yuma Koizumi, Heiga Zen, Shigeki Karita, Yifan Ding, Kohei Yatabe, Nobuyuki Morioka, Michiel Bacchiani, Yu Zhang, Wei Han, Ankur Bapna

The constituent samples of LibriTTS-R are identical to those of LibriTTS, with only the sound quality improved.

Miipher: A Robust Speech Restoration Model Integrating Self-Supervised Speech and Text Representations

1 code implementation • 3 Mar 2023 • Yuma Koizumi, Heiga Zen, Shigeki Karita, Yifan Ding, Kohei Yatabe, Nobuyuki Morioka, Yu Zhang, Wei Han, Ankur Bapna, Michiel Bacchiani

Experiments show that Miipher (i) is robust against various types of audio degradation and (ii) enables us to train a high-quality text-to-speech (TTS) model from restored speech samples collected from the web.

Speech Denoising, Speech Enhancement

WaveFit: An Iterative and Non-autoregressive Neural Vocoder based on Fixed-Point Iteration

no code implementations • 3 Oct 2022 • Yuma Koizumi, Kohei Yatabe, Heiga Zen, Michiel Bacchiani

DDPMs and GANs can be characterized by their iterative denoising framework and adversarial training, respectively.

Denoising
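
As a rough illustration of the fixed-point view, the following sketch (hypothetical Python/PyTorch, not the authors' code) applies one denoising network repeatedly so that the clean waveform is approximately a fixed point of the update; `denoiser` and `conditioning` are assumed placeholders, and the simple max-normalization merely stands in for WaveFit's gain adjustment.

    import torch

    def wavefit_style_refinement(denoiser, noise, conditioning, n_iter=5):
        # Start from Gaussian noise and apply the same update repeatedly; at a
        # fixed point y = y - denoiser(y, c), the estimated noise term is zero.
        y = noise
        for _ in range(n_iter):
            y = y - denoiser(y, conditioning)   # remove the estimated noise
            y = y / (y.abs().max() + 1e-8)      # crude gain normalization (stand-in)
        return y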

SpecGrad: Diffusion Probabilistic Model based Neural Vocoder with Adaptive Noise Spectral Shaping

no code implementations • 31 Mar 2022 • Yuma Koizumi, Heiga Zen, Kohei Yatabe, Nanxin Chen, Michiel Bacchiani

Neural vocoders based on the denoising diffusion probabilistic model (DDPM) have been improved by adapting the diffusion noise distribution to the given acoustic features.

Denoising, Speech Enhancement
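
The idea of noise spectral shaping can be illustrated by filtering white Gaussian noise in the T-F domain so that its power spectrum follows an envelope derived from the acoustic features. A minimal sketch under that assumption (the `envelope` array is hypothetical, and SpecGrad's actual filter design differs):

    import numpy as np
    from scipy.signal import stft, istft

    def shape_noise(envelope, length, nperseg=512, noverlap=384):
        # envelope: (nperseg // 2 + 1, frames) spectral envelope derived from
        # the acoustic features; it must cover at least `frames` STFT frames.
        rng = np.random.default_rng()
        noise = rng.standard_normal(length)              # white Gaussian noise
        _, _, spec = stft(noise, nperseg=nperseg, noverlap=noverlap)
        spec *= envelope[:, : spec.shape[1]]             # spectral shaping
        _, shaped = istft(spec, nperseg=nperseg, noverlap=noverlap)
        return shaped[:length]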

APPLADE: Adjustable Plug-and-play Audio Declipper Combining DNN with Sparse Optimization

no code implementations • 16 Feb 2022 • Tomoro Tanaka, Kohei Yatabe, Masahiro Yasuda, Yasuhiro Oikawa

Still, they cannot perform well if the training data are mismatched and/or constraints in the time domain are not imposed.

Audio Declipping
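
A generic example of such a time-domain constraint is the clipping-consistency projection sketched below (an illustration, not the paper's exact operator): samples that were not saturated in the observation are kept as observed, while saturated samples must stay at or above the clipping level in magnitude.

    import numpy as np

    def project_clipping_consistent(estimate, observed, threshold):
        # Samples saturated in the observation are the ones to be restored.
        clipped = np.abs(observed) >= threshold
        # Clipped samples must reach at least the clipping level, keeping the
        # observed sign; unclipped samples are reliable and kept unchanged.
        mag = np.maximum(np.abs(estimate), threshold)
        return np.where(clipped, np.sign(observed) * mag, observed)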

Design of Tight Minimum-Sidelobe Windows by Riemannian Newton's Method

no code implementations • 2 Nov 2021 • Daichi Kitahara, Kohei Yatabe

The short-time Fourier transform (STFT), or the discrete Gabor transform (DGT), has been extensively used in signal analysis and processing.

Sampling-Frequency-Independent Audio Source Separation Using Convolution Layer Based on Impulse Invariant Method

1 code implementation • 10 May 2021 • Koichi Saito, Tomohiko Nakamura, Kohei Yatabe, Yuma Koizumi, Hiroshi Saruwatari

Audio source separation is often used as preprocessing for various applications, and one of its ultimate goals is to construct a single versatile model capable of dealing with a variety of audio signals.

Audio Source Separation, Music Source Separation
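
The impulse invariant method named in the title converts an analog (continuous-time) filter into a digital one by sampling its impulse response at the target rate, which is why a single analog prototype can serve any sampling frequency. A minimal sketch with a hypothetical first-order prototype (not the paper's learned filters):

    import numpy as np

    def impulse_invariant(h_analog, fs, n_taps):
        # h_analog: callable mapping time t (seconds) to the analog impulse
        # response h(t); fs: target sampling frequency in Hz.
        T = 1.0 / fs
        n = np.arange(n_taps)
        return T * h_analog(n * T)          # h[n] = T * h_a(n / fs)

    # The same analog prototype yields matched filters at different rates:
    proto = lambda t: np.exp(-1000.0 * t)   # hypothetical first-order lowpass
    h_16k = impulse_invariant(proto, fs=16000, n_taps=64)
    h_48k = impulse_invariant(proto, fs=48000, n_taps=192)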

Sparse time-frequency representation via atomic norm minimization

no code implementations • 7 May 2021 • Tsubasa Kusano, Kohei Yatabe, Yasuhiro Oikawa

In this paper, we propose a method of estimating a sparse T-F representation using the atomic norm.
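
For reference, the atomic norm over an atom set \mathcal{A} is standardly defined as below; it generalizes the l1 norm and promotes representations built from few atoms (the paper's specific choice of T-F atoms is not reproduced here):

    % Atomic norm of x with respect to the atom set \mathcal{A}:
    \|x\|_{\mathcal{A}} = \inf\Big\{ \sum_{a \in \mathcal{A}} c_a \;:\; x = \sum_{a \in \mathcal{A}} c_a\, a,\ c_a \ge 0 \Big\}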

Noisy-target Training: A Training Strategy for DNN-based Speech Enhancement without Clean Speech

no code implementations • 21 Jan 2021 • Takuya Fujimura, Yuma Koizumi, Kohei Yatabe, Ryoichi Miyazaki

This requirement currently restricts the amount of training data for speech enhancement to less than 1/1000 of that for speech recognition, which does not need clean signals.

Speech Enhancement, Speech Recognition +1

Self-supervised Neural Audio-Visual Sound Source Localization via Probabilistic Spatial Modeling

no code implementations • 28 Jul 2020 • Yoshiki Masuyama, Yoshiaki Bando, Kohei Yatabe, Yoko Sasaki, Masaki Onishi, Yasuhiro Oikawa

By incorporating the spatial information in multichannel audio signals, our method trains deep neural networks (DNNs) to distinguish multiple sound source objects.

Self-Supervised Learning

Gamma Boltzmann Machine for Simultaneously Modeling Linear- and Log-amplitude Spectra

no code implementations • 24 Jun 2020 • Toru Nakashika, Kohei Yatabe

Its conditional distribution of the observable data is given by the gamma distribution, and thus the proposed RBM can naturally handle data represented by positive numbers, such as amplitude spectra.
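
For reference, the gamma density with shape k > 0 and scale \theta > 0 is supported only on x > 0, which is why positive-valued amplitude spectra fit it naturally:

    % Gamma probability density function:
    p(x \mid k, \theta) = \frac{x^{k-1} e^{-x/\theta}}{\Gamma(k)\, \theta^{k}}, \qquad x > 0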

Consistent ICA: Determined BSS meets spectrogram consistency

no code implementations • 20 May 2020 • Kohei Yatabe

Multichannel audio blind source separation (BSS) in the determined situation (where the number of microphones equals the number of sources), or determined BSS, is performed by multichannel linear filtering in the time-frequency domain to handle the convolutive mixing process.

Blind Source Separation
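
The multichannel linear filtering mentioned above amounts to applying a per-frequency demixing matrix to the mixture spectrogram; a minimal sketch is given below (the matrices W would be estimated by an ICA-type algorithm and are assumed given here):

    import numpy as np

    def demix(X, W):
        # X: (freq, frames, mics) multichannel mixture STFT.
        # W: (freq, srcs, mics) frequency-wise demixing matrices.
        # Y[f, t, n] = sum_m W[f, n, m] * X[f, t, m]
        return np.einsum('fnm,ftm->ftn', W, X)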

Determined BSS based on time-frequency masking and its application to harmonic vector analysis

no code implementations • 29 Apr 2020 • Kohei Yatabe, Daichi Kitamura

This paper proposes harmonic vector analysis (HVA) based on a general algorithmic framework of audio blind source separation (BSS) that is also presented in this paper.

Blind Source Separation

Phase reconstruction based on recurrent phase unwrapping with deep neural networks

no code implementations • 14 Feb 2020 • Yoshiki Masuyama, Kohei Yatabe, Yuma Koizumi, Yasuhiro Oikawa, Noboru Harada

In the proposed method, DNNs estimate phase derivatives instead of the phase itself, which allows us to avoid the sensitivity problem.

Audio Synthesis
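
In the simplest case, a phase estimate is obtained by integrating the time-direction derivative along frames; the sketch below illustrates only that step (the paper's recurrent phase unwrapping also exploits the frequency-direction derivative, which is omitted here):

    import numpy as np

    def integrate_time_derivative(dphase_dt, initial_phase):
        # dphase_dt: (freq, frames) estimated time-direction phase derivative;
        # initial_phase: (freq,) phase of the first frame.
        increments = np.concatenate(
            [initial_phase[:, None], dphase_dt[:, 1:]], axis=1)
        phase = increments.cumsum(axis=1)
        # Wrap to (-pi, pi]; only the principal value matters for resynthesis.
        return np.angle(np.exp(1j * phase))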

Speech Enhancement using Self-Adaptation and Multi-Head Self-Attention

no code implementations • 14 Feb 2020 • Yuma Koizumi, Kohei Yatabe, Marc Delcroix, Yoshiki Masuyama, Daiki Takeuchi

This paper investigates a self-adaptation method for speech enhancement using auxiliary speaker-aware features; we extract a speaker representation used for adaptation directly from the test utterance.

Multi-Task Learning, Speaker Identification +3
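
The data flow is simple: the speaker representation is computed from the noisy test utterance itself, so no enrollment recording of the target speaker is needed. A minimal sketch with hypothetical modules `speaker_encoder` and `enhance_net`:

    import torch

    def enhance_with_self_adaptation(enhance_net, speaker_encoder, noisy):
        # Speaker-aware feature extracted from the test utterance itself.
        embedding = speaker_encoder(noisy)
        # Enhancement conditioned on that feature (e.g., via concatenation).
        return enhance_net(noisy, embedding)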

Stable Training of DNN for Speech Enhancement based on Perceptually-Motivated Black-Box Cost Function

no code implementations • 14 Feb 2020 • Masaki Kawanaka, Yuma Koizumi, Ryoichi Miyazaki, Kohei Yatabe

For evaluating subjective quality, several methods related to perceptually-motivated objective sound quality assessment (OSQA) have been proposed, such as PESQ (perceptual evaluation of speech quality).

Speech Enhancement

Invertible DNN-based nonlinear time-frequency transform for speech enhancement

1 code implementation • 25 Nov 2019 • Daiki Takeuchi, Kohei Yatabe, Yuma Koizumi, Yasuhiro Oikawa, Noboru Harada

Therefore, some end-to-end methods used a DNN to learn the linear T-F transform, which is much easier to understand.

Audio and Speech Processing, Sound
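
A common way to realize such a learned linear T-F transform is a strided 1-D convolution for analysis and a transposed convolution for synthesis, analogous to the windowed basis projection of the STFT. A minimal sketch with hypothetical sizes (not the paper's invertible construction):

    import torch
    import torch.nn as nn

    class LinearTF(nn.Module):
        def __init__(self, n_basis=512, win=400, hop=100):
            super().__init__()
            self.analysis = nn.Conv1d(1, n_basis, win, stride=hop, bias=False)
            self.synthesis = nn.ConvTranspose1d(n_basis, 1, win, stride=hop, bias=False)

        def forward(self, x):               # x: (batch, 1, samples)
            coeffs = self.analysis(x)       # (batch, n_basis, frames): learned "spectrogram"
            return self.synthesis(coeffs)   # back to (batch, 1, ~samples), up to boundaries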

Deep Griffin-Lim Iteration

no code implementations • 10 Mar 2019 • Yoshiki Masuyama, Kohei Yatabe, Yuma Koizumi, Yasuhiro Oikawa, Noboru Harada

This paper presents a novel method that reconstructs the phase only from a given amplitude spectrogram by combining a signal-processing-based approach and a deep neural network (DNN).
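
For context, the classic Griffin-Lim algorithm alternates between enforcing the given amplitude and enforcing STFT consistency; Deep Griffin-Lim Iteration inserts a trained DNN into this loop. The hedged sketch below shows only the plain Griffin-Lim iteration, assuming a Hann-window STFT via SciPy:

    import numpy as np
    from scipy.signal import stft, istft

    def griffin_lim(amplitude, n_iter=100, nperseg=512, noverlap=384):
        # amplitude: (nperseg // 2 + 1, frames) target magnitude spectrogram.
        rng = np.random.default_rng(0)
        phase = np.exp(1j * rng.uniform(-np.pi, np.pi, amplitude.shape))
        spec = amplitude * phase                       # random initial phase
        for _ in range(n_iter):
            # Project onto consistent spectrograms (ISTFT followed by STFT) ...
            _, x = istft(spec, nperseg=nperseg, noverlap=noverlap)
            _, _, spec = stft(x, nperseg=nperseg, noverlap=noverlap)
            # ... then restore the given amplitude, keeping only the phase.
            spec = amplitude * np.exp(1j * np.angle(spec))
        _, x = istft(spec, nperseg=nperseg, noverlap=noverlap)
        return x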
