Search Results for author: Shusuke Takahashi

Found 13 papers, 6 papers with code

STARSS22: A dataset of spatial recordings of real scenes with spatiotemporal annotations of sound events

2 code implementations4 Jun 2022 Archontis Politis, Kazuki Shimada, Parthasaarathy Sudarsanam, Sharath Adavanne, Daniel Krause, Yuichiro Koyama, Naoya Takahashi, Shusuke Takahashi, Yuki Mitsufuji, Tuomas Virtanen

Additionally, the report presents the baseline system that accompanies the dataset in the challenge with emphasis on the differences with the baseline of the previous iterations; namely, introduction of the multi-ACCDOA representation to handle multiple simultaneous occurences of events of the same class, and support for additional improved input features for the microphone array format.

Sound Event Localization and Detection

SQ-VAE: Variational Bayes on Discrete Representation with Self-annealed Stochastic Quantization

1 code implementation16 May 2022 Yuhta Takida, Takashi Shibuya, WeiHsiang Liao, Chieh-Hsin Lai, Junki Ohmura, Toshimitsu Uesaka, Naoki Murata, Shusuke Takahashi, Toshiyuki Kumakura, Yuki Mitsufuji

In this paper, we propose a new training scheme that extends the standard VAE via novel stochastic dequantization and quantization, called stochastically quantized variational autoencoder (SQ-VAE).

Quantization

Improving Character Error Rate Is Not Equal to Having Clean Speech: Speech Enhancement for ASR Systems with Black-box Acoustic Models

no code implementations12 Oct 2021 Ryosuke Sawata, Yosuke Kashiwagi, Shusuke Takahashi

In order to optimize the DNN-based SE model in terms of the character error rate (CER), which is one of the metric to evaluate the ASR system and generally non-differentiable, our method uses two DNNs: one for speech processing and one for mimicking the output CERs derived through an acoustic model (AM).

Automatic Speech Recognition Speech Enhancement +1

Manifold-Aware Deep Clustering: Maximizing Angles between Embedding Vectors Based on Regular Simplex

no code implementations4 Jun 2021 Keitaro Tanaka, Ryosuke Sawata, Shusuke Takahashi

This paper presents a new deep clustering (DC) method called manifold-aware DC (M-DC) that can enhance hyperspace utilization more effectively than the original DC.

Deep Clustering

Preventing Oversmoothing in VAE via Generalized Variance Parameterization

no code implementations17 Feb 2021 Yuhta Takida, Wei-Hsiang Liao, Chieh-Hsin Lai, Toshimitsu Uesaka, Shusuke Takahashi, Yuki Mitsufuji

Variational autoencoders (VAEs) often suffer from posterior collapse, which is a phenomenon in which the learned latent space becomes uninformative.

AR-ELBO: Preventing Posterior Collapse Induced by Oversmoothing in Gaussian VAE

no code implementations1 Jan 2021 Yuhta Takida, Wei-Hsiang Liao, Toshimitsu Uesaka, Shusuke Takahashi, Yuki Mitsufuji

Variational autoencoders (VAEs) often suffer from posterior collapse, which is a phenomenon that the learned latent space becomes uninformative.

All for One and One for All: Improving Music Separation by Bridging Networks

5 code implementations8 Oct 2020 Ryosuke Sawata, Stefan Uhlich, Shusuke Takahashi, Yuki Mitsufuji

This paper proposes several improvements for music separation with deep neural networks (DNNs), namely a multi-domain loss (MDL) and two combination schemes.

Music Source Separation

Cannot find the paper you are looking for? You can Submit a new open access paper.