no code implementations • 14 Aug 2023 • Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka, Shogo Seki
Because a 1D CNN has difficulty modeling high-dimensional spectrograms, the frequency dimension is reduced via temporal upsampling.
no code implementations • 24 Mar 2023 • Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka, Shogo Seki
This architecture provides the generator with sufficiently rich information for the synthesized speech to closely match real speech.
1 code implementation • 4 Mar 2022 • Takuhiro Kaneko, Kou Tanaka, Hirokazu Kameoka, Shogo Seki
In recent text-to-speech synthesis and voice conversion systems, a mel-spectrogram is commonly used as an intermediate representation, and the need for a mel-spectrogram vocoder is increasing.
no code implementations • 29 Sep 2018 • Shogo Seki, Hirokazu Kameoka, Li Li, Tomoki Toda, Kazuya Takeda
This paper addresses the problem of multichannel audio source separation in underdetermined conditions.