no code implementations • 13 Feb 2024 • Shiqi Zhang, Zheng Qiu, Daiki Takeuchi, Noboru Harada, Shoji Makino
With the rapid development of neural networks in recent years, various networks have become remarkably good at enhancing the magnitude spectrum of noisy speech in single-channel speech enhancement.
no code implementations • 14 Dec 2023 • Kunxing Lu, Xianrui Wang, Tetsuya Ueda, Shoji Makino, Jingdong Chen
While semi-blind source separation-based acoustic echo cancellation (SBSS-AEC) has received much research attention due to its promising performance during double-talk compared with traditional adaptive algorithms, it suffers from system latency and nonlinear distortions.
no code implementations • 20 Nov 2023 • Hanako Segawa, Tsubasa Ochiai, Marc Delcroix, Tomohiro Nakatani, Rintaro Ikeshita, Shoko Araki, Takeshi Yamada, Shoji Makino
However, this training objective may not be optimal for a specific array processing back-end, such as beamforming.
no code implementations • IEEE Access 2022 • Jennifer Santoso, Takeshi Yamada, Kenkichi Ishizuka, Taiichi Hashimoto, Shoji Makino
Although there is a method to improve ASR performance in the presence of emotional speech, it requires fine-tuning the ASR, which has a high computational cost and discards cues that are important for determining the presence of emotion in speech segments and that can be helpful in SER.
Ranked #4 on Multimodal Emotion Recognition on IEMOCAP
Multimodal Emotion Recognition, Speech Emotion Recognition, +2
no code implementations • 16 Dec 2018 • Li Li, Hirokazu Kameoka, Shoji Makino
While the MVAE is notable for its impressive source separation performance, its convergence-guaranteed optimization algorithm, and its ability to estimate source-class labels simultaneously with source separation, it still has two major drawbacks, i.e., high computational complexity and unsatisfactory source classification accuracy.
1 code implementation • 2 Aug 2018 • Hirokazu Kameoka, Li Li, Shota Inoue, Shoji Makino
This paper proposes a multichannel source separation technique called the multichannel variational autoencoder (MVAE) method, which uses a conditional VAE (CVAE) to model and estimate the power spectrograms of the sources in a mixture.