Acoustic Modelling

10 papers with code • 0 benchmarks • 0 datasets

Latest papers with no code

An overview of text-to-speech systems and media applications

no code yet • 22 Oct 2023

Producing a synthetic voice that sounds human is an emerging novelty of modern interactive media systems.

Comparing normalizing flows and diffusion models for prosody and acoustic modelling in text-to-speech

no code yet • 31 Jul 2023

Neural text-to-speech systems are often optimized on L1/L2 losses, which make strong assumptions about the distributions of the target data space.
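The distributional assumption mentioned in this snippet can be made concrete with a small numerical check. The sketch below is not taken from the paper. It relies on the standard result that minimizing an L2 loss matches maximum likelihood under a Gaussian with fixed variance, and minimizing an L1 loss matches maximum likelihood under a Laplace distribution with fixed scale. All function names and the toy numbers are hypothetical.

```python
import math


def gaussian_nll(y, pred, sigma=1.0):
    """Negative log-likelihood of y under a Gaussian N(pred, sigma^2)."""
    return 0.5 * math.log(2 * math.pi * sigma ** 2) + (y - pred) ** 2 / (2 * sigma ** 2)


def laplace_nll(y, pred, b=1.0):
    """Negative log-likelihood of y under a Laplace(pred, b) distribution."""
    return math.log(2 * b) + abs(y - pred) / b


def l2_loss(y, pred):
    return (y - pred) ** 2


def l1_loss(y, pred):
    return abs(y - pred)


def total(fn, ys, ps):
    """Sum a per-sample function over paired targets and predictions."""
    return sum(fn(y, p) for y, p in zip(ys, ps))


# Made-up targets and two candidate sets of predictions (illustrative only).
targets = [0.2, -1.3, 0.7, 2.5]
preds_a = [0.0, -1.0, 1.0, 2.0]
preds_b = [0.5, -0.5, 0.5, 1.0]

# With sigma = 1, the Gaussian NLL is a constant plus half the squared error,
# so the NLL gap between two candidates is half the L2 loss gap.
gauss_gap = total(gaussian_nll, targets, preds_a) - total(gaussian_nll, targets, preds_b)
l2_gap = 0.5 * (total(l2_loss, targets, preds_a) - total(l2_loss, targets, preds_b))

# With b = 1, the Laplace NLL is a constant plus the absolute error,
# so the NLL gap equals the L1 loss gap.
laplace_gap = total(laplace_nll, targets, preds_a) - total(laplace_nll, targets, preds_b)
l1_gap = total(l1_loss, targets, preds_a) - total(l1_loss, targets, preds_b)

print(gauss_gap, l2_gap)
print(laplace_gap, l1_gap)
```

Because the losses and the negative log-likelihoods differ only by a constant and a fixed scale factor, ranking predictions by L2 or L1 loss is the same as ranking them by likelihood under a unimodal Gaussian or Laplace model.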

Speaker- and Age-Invariant Training for Child Acoustic Modeling Using Adversarial Multi-Task Learning

no code yet • 19 Oct 2022

These high acoustic variations along with the scarcity of child speech corpora have impeded the development of a reliable speech recognition system for children.

Impact of Dataset on Acoustic Models for Automatic Speech Recognition

no code yet • 25 Mar 2022

GMM models are widely used to create alignments of the training data for hybrid deep neural network models, which makes creating accurate alignments an important task.

Investigation of Deep Neural Network Acoustic Modelling Approaches for Low Resource Accented Mandarin Speech Recognition

no code yet • 24 Jan 2022

Meanwhile, approaches to multi-accent modelling, including multi-style training, multi-accent decision tree state tying, DNN tandem and multi-level adaptive network (MLAN) tandem hidden Markov model (HMM) modelling, are combined and compared in this paper.

Common Phone: A Multilingual Dataset for Robust Acoustic Modelling

no code yet • LREC 2022

A Wav2Vec 2.0 acoustic model was trained with the Common Phone to perform phonetic symbol recognition and validate the quality of the generated phonetic annotation.

Enhancing audio quality for expressive Neural Text-to-Speech

no code yet • 13 Aug 2021

Artificial speech synthesis has made a great leap in terms of naturalness as recent Text-to-Speech (TTS) systems are capable of producing speech with similar quality to human recordings.

Low Resource German ASR with Untranscribed Data Spoken by Non-native Children -- INTERSPEECH 2021 Shared Task SPAPL System

no code yet • 18 Jun 2021

About 5 hours of transcribed data and about 60 hours of untranscribed data are provided to develop a German ASR system for children.

End-to-end acoustic modelling for phone recognition of young readers

no code yet • 4 Mar 2021

Through transfer learning, a Transformer model complemented with a Connectionist Temporal Classification (CTC) objective function reaches a phone error rate of 28.1%, outperforming a state-of-the-art DNN-HMM model by 6.6% relative, as well as other end-to-end architectures by more than 8.5% relative.
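To make the "relative" figures concrete, the sketch below back-computes a baseline error rate from a new error rate and a relative reduction. It assumes the common convention that a relative reduction r means new = baseline × (1 − r). The snippet above does not state the baseline values or confirm this convention, so the resulting number is an illustrative estimate, not a reported result. The function names are hypothetical.

```python
def implied_baseline(new_rate, relative_reduction):
    """Baseline error rate implied by a new rate and a relative reduction.

    Assumes new_rate = baseline * (1 - relative_reduction).
    """
    return new_rate / (1.0 - relative_reduction)


def relative_reduction(baseline, new_rate):
    """Relative error reduction of new_rate with respect to baseline."""
    return (baseline - new_rate) / baseline


# Figures quoted in the snippet: 28.1% phone error rate, 6.6% relative gain.
per_transformer = 28.1
dnn_hmm_estimate = implied_baseline(per_transformer, 0.066)

# Under the assumed convention this lands at roughly 30.1% for the DNN-HMM model.
print(round(dnn_hmm_estimate, 2))

# Round trip: the estimate reproduces the quoted relative reduction.
print(round(relative_reduction(dnn_hmm_estimate, per_transformer), 3))
```

Under this reading, a 6.6% relative improvement corresponds to an absolute gap of about two percentage points, which is why relative and absolute reductions should not be compared directly.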

A comparative study of two-dimensional vocal tract acoustic modeling based on Finite-Difference Time-Domain methods

no code yet • 9 Feb 2021

Two-dimensional (2D) numerical approaches to vocal tract (VT) modelling can offer a better balance between low computational cost and accurate rendering of acoustic wave propagation.