Search Results for author: Paavo Alku

Found 17 papers, 3 papers with code

Speech waveform synthesis from MFCC sequences with generative adversarial networks

1 code implementation • 3 Apr 2018 • Lauri Juvela, Bajibabu Bollepalli, Xin Wang, Hirokazu Kameoka, Manu Airaksinen, Junichi Yamagishi, Paavo Alku

This paper proposes a method for generating speech from filterbank mel frequency cepstral coefficients (MFCC), which are widely used in speech applications, such as ASR, but are generally considered unusable for speech synthesis.

Generative Adversarial Network, Speech Synthesis

Speaker-independent raw waveform model for glottal excitation

no code implementations • 25 Apr 2018 • Lauri Juvela, Vassilis Tsiaras, Bajibabu Bollepalli, Manu Airaksinen, Junichi Yamagishi, Paavo Alku

Recent speech technology research has seen a growing interest in using WaveNets as statistical vocoders, i.e., generating speech waveforms from acoustic features.

Speech Synthesis, Text-To-Speech Synthesis

Waveform generation for text-to-speech synthesis using pitch-synchronous multi-scale generative adversarial networks

no code implementations • 30 Oct 2018 • Lauri Juvela, Bajibabu Bollepalli, Junichi Yamagishi, Paavo Alku

The state of the art in text-to-speech synthesis has recently improved considerably due to novel neural waveform generation methods, such as WaveNet.

Image Generation, Speech Synthesis

Generative adversarial network-based glottal waveform model for statistical parametric speech synthesis

no code implementations • 14 Mar 2019 • Bajibabu Bollepalli, Lauri Juvela, Paavo Alku

The results show that the newly proposed GANs achieve synthesis quality comparable to that of widely-used DNNs, without using an additive noise component.

Generative Adversarial Network, Speech Synthesis

GELP: GAN-Excited Linear Prediction for Speech Synthesis from Mel-spectrogram

1 code implementation • 8 Apr 2019 • Lauri Juvela, Bajibabu Bollepalli, Junichi Yamagishi, Paavo Alku

Recent advances in neural network-based text-to-speech have reached human-level naturalness in synthetic speech.

Speech Synthesis

Glottal Source Processing: from Analysis to Applications

no code implementations • 29 Dec 2019 • Thomas Drugman, Paavo Alku, Abeer Alwan, Bayya Yegnanarayana

The great majority of current voice technology applications rely on acoustic features characterizing the vocal tract response, such as the widely used MFCC or LPC parameters.

Formant Tracking Using Quasi-Closed Phase Forward-Backward Linear Prediction Analysis and Deep Neural Networks

no code implementations • 5 Jan 2022 • Dhananjaya Gowda, Bajibabu Bollepalli, Sudarsana Reddy Kadiri, Paavo Alku

Formant tracking is investigated in this study by using trackers based on dynamic programming (DP) and deep neural nets (DNNs).

Investigation of Self-supervised Pre-trained Models for Classification of Voice Quality from Speech and Neck Surface Accelerometer Signals

no code implementations • 6 Aug 2023 • Sudarsana Reddy Kadiri, Farhad Javanmardi, Paavo Alku

Of the features compared, the pre-trained model-based features showed better classification accuracies than the conventional features for both speech and NSA inputs.

Classification

Severity Classification of Parkinson's Disease from Speech using Single Frequency Filtering-based Features

no code implementations • 17 Aug 2023 • Sudarsana Reddy Kadiri, Manila Kodali, Paavo Alku

Developing objective methods for assessing the severity of Parkinson's disease (PD) is crucial for improving the diagnosis and treatment.

Sentence

Refining a Deep Learning-based Formant Tracker using Linear Prediction Methods

no code implementations • 17 Aug 2023 • Paavo Alku, Sudarsana Reddy Kadiri, Dhananjaya Gowda

The results indicated that the data-driven DeepFormants trackers outperformed the conventional trackers and that the best performance was obtained by refining the formants predicted by DeepFormants using QCP-FB analysis.

Time-Varying Quasi-Closed-Phase Analysis for Accurate Formant Tracking in Speech Signals

1 code implementation • 31 Aug 2023 • Dhananjaya Gowda, Sudarsana Reddy Kadiri, Brad Story, Paavo Alku

Formant tracking experiments with a wide variety of synthetic and natural speech signals show that the proposed TVQCP method performs better than conventional and popular formant tracking tools, such as Wavesurfer and Praat (based on dynamic programming), the KARMA algorithm (based on Kalman filtering), and DeepFormants (based on deep neural networks trained in a supervised manner).

Wav2vec-based Detection and Severity Level Classification of Dysarthria from Speech

no code implementations • 25 Sep 2023 • Farhad Javanmardi, Saska Tirronen, Manila Kodali, Sudarsana Reddy Kadiri, Paavo Alku

Automatic detection and severity level classification of dysarthria directly from acoustic speech signals can be used as a tool in medical diagnosis.

Classification, Medical Diagnosis

Analysis and Detection of Pathological Voice using Glottal Source Features

no code implementations • 25 Sep 2023 • Sudarsana Reddy Kadiri, Paavo Alku

From the detection experiments, it was observed that the performance achieved with the studied glottal source features is comparable to or better than that of conventional MFCCs and perceptual linear prediction (PLP) features.
