Search Results for author: Mayank Kumar Singh

Found 6 papers, 1 paper with code

Iteratively Improving Speech Recognition and Voice Conversion

no code implementations • 24 May 2023 • Mayank Kumar Singh, Naoya Takahashi, Onoe Naoyuki

Many existing works on voice conversion (VC) use automatic speech recognition (ASR) models to ensure linguistic consistency between source and converted samples.

Automatic Speech Recognition (ASR) +3

Nonparallel Emotional Voice Conversion For Unseen Speaker-Emotion Pairs Using Dual Domain Adversarial Network & Virtual Domain Pairing

no code implementations • 21 Feb 2023 • Nirmesh Shah, Mayank Kumar Singh, Naoya Takahashi, Naoyuki Onoe

The primary goal of an emotional voice conversion (EVC) system is to convert the emotion of a given speech signal from one style to another without modifying the linguistic content of the signal.

Voice Conversion

Robust One-Shot Singing Voice Conversion

no code implementations • 20 Oct 2022 • Naoya Takahashi, Mayank Kumar Singh, Yuki Mitsufuji

We then propose a two-stage training method called Robustify that trains the one-shot SVC model in the first stage on clean data to ensure high-quality conversion, and introduces enhancement modules to the encoders of the model in the second stage to enhance the feature extraction from distorted singing voices.

Voice Conversion

Hierarchical disentangled representation learning for singing voice conversion

no code implementations • 18 Jan 2021 • Naoya Takahashi, Mayank Kumar Singh, Yuki Mitsufuji

Conventional singing voice conversion (SVC) methods often struggle to operate on high-resolution audio owing to the high dimensionality of the data.

Representation Learning, Voice Conversion

Improving Voice Separation by Incorporating End-to-end Speech Recognition

1 code implementation • 29 Nov 2019 • Naoya Takahashi, Mayank Kumar Singh, Sakya Basak, Parthasaarathy Sudarsanam, Sriram Ganapathy, Yuki Mitsufuji

Despite recent advances in voice separation methods, many challenges remain in realistic scenarios, such as noisy recordings and limited available data.

Automatic Speech Recognition (ASR) +3
