Search Results for author: Simon King

Found 18 papers, 4 papers with code

Natural language guidance of high-fidelity text-to-speech with synthetic annotations

no code implementations • 2 Feb 2024 • Dan Lyth, Simon King

We propose a scalable method for labeling various aspects of speaker identity, style, and recording conditions.

In-Context Learning · Language Modelling

Differentiable Grey-box Modelling of Phaser Effects using Frame-based Spectral Processing

no code implementations • 2 Jun 2023 • Alistair Carson, Cassia Valentini-Botinhao, Simon King, Stefan Bilbao

The frame duration is an important hyper-parameter of the proposed model, so we investigate its effect on model accuracy.

Controllable Speaking Styles Using a Large Language Model

no code implementations • 17 May 2023 • Atli Thor Sigurgeirsson, Simon King

Given only a natural language query text (the prompt), such models can be used to solve specific, context-dependent tasks.

Language Modelling · Large Language Model · +1

Do Prosody Transfer Models Transfer Prosody?

no code implementations • 7 Mar 2023 • Atli Thor Sigurgeirsson, Simon King

This is done with a learned embedding of the reference utterance, which conditions speech generation.
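As a minimal sketch of the reference-embedding idea described above, the snippet below assumes a simple GRU encoder that summarises a reference mel-spectrogram into a fixed-size prosody vector; the models evaluated in the paper use their own, more elaborate encoders, and all dimensions here are illustrative.

import torch
import torch.nn as nn

class ReferenceEncoder(nn.Module):
    """Illustrative only: summarise a reference mel-spectrogram into a
    fixed-size prosody embedding (not the specific models evaluated in the paper)."""
    def __init__(self, n_mels=80, embed_dim=128):
        super().__init__()
        self.gru = nn.GRU(n_mels, embed_dim, batch_first=True)

    def forward(self, ref_mel):          # ref_mel: (batch, frames, n_mels)
        _, h = self.gru(ref_mel)         # final hidden state summarises the reference
        return h.squeeze(0)              # (batch, embed_dim)

# The embedding would then be concatenated with (or added to) the TTS model's
# text-encoder outputs so that generated speech imitates the reference prosody.
ref_mel = torch.randn(2, 200, 80)
prosody = ReferenceEncoder()(ref_mel)    # (2, 128) conditioning vector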

Speech Synthesis · Text-To-Speech Synthesis

Ctrl-P: Temporal Control of Prosodic Variation for Speech Synthesis

no code implementations • 15 Jun 2021 • Devang S Ram Mohan, Vivian Hu, Tian Huey Teh, Alexandra Torresquintero, Christopher G. R. Wallis, Marlene Staib, Lorenzo Foglianti, Jiameng Gao, Simon King

Text does not fully specify the spoken form, so text-to-speech models must be able to learn from speech data that vary in ways not explained by the corresponding text.

Speech Synthesis

ADEPT: A Dataset for Evaluating Prosody Transfer

no code implementations • 15 Jun 2021 • Alexandra Torresquintero, Tian Huey Teh, Christopher G. R. Wallis, Marlene Staib, Devang S Ram Mohan, Vivian Hu, Lorenzo Foglianti, Jiameng Gao, Simon King

Text-to-speech can now achieve near-human naturalness, and research focus has shifted to increasing expressivity.

Using previous acoustic context to improve Text-to-Speech synthesis

no code implementations • 7 Dec 2020 • Pilar Oplustil-Gallegos, Simon King

Many speech synthesis datasets, especially those derived from audiobooks, naturally comprise sequences of utterances.

Speech Synthesis · Text-To-Speech Synthesis

Perception of prosodic variation for speech synthesis using an unsupervised discrete representation of F0

1 code implementation • 14 Mar 2020 • Zack Hodari, Catherine Lai, Simon King

In English, prosody adds a broad range of information to segment sequences, from information structure (e.g. contrast) to stylistic variation (e.g. expression of emotion).

Clustering · Representation Learning · +1

Using generative modelling to produce varied intonation for speech synthesis

1 code implementation • 10 Jun 2019 • Zack Hodari, Oliver Watts, Simon King

A generative model that can synthesise multiple prosodies will, by design, not model average prosody.

Sentence · Speech Synthesis

Attentive Filtering Networks for Audio Replay Attack Detection

1 code implementation • 31 Oct 2018 • Cheng-I Lai, Alberto Abad, Korin Richmond, Junichi Yamagishi, Najim Dehak, Simon King

In this work, we propose our replay attack detection system, the Attentive Filtering Network, which is composed of an attention-based filtering mechanism that enhances feature representations in both the frequency and time domains, and a ResNet-based classifier.
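The sketch below illustrates the general attentive-filtering idea only, not the published architecture: a small convolutional attention module produces a mask over the time-frequency plane, the input features are re-weighted by that mask, and a stand-in convolutional classifier replaces the paper's ResNet. Layer sizes and the two-class output are assumptions for illustration.

import torch
import torch.nn as nn

class AttentiveFiltering(nn.Module):
    """Sketch only: learn an attention map over the time-frequency plane and
    re-weight the input features before a convolutional classifier. The
    published network uses a deeper attention module and a full ResNet."""
    def __init__(self):
        super().__init__()
        self.attention = nn.Sequential(          # per time-frequency bin weights
            nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
            nn.Conv2d(8, 1, 3, padding=1), nn.Sigmoid(),
        )
        self.classifier = nn.Sequential(         # stand-in for the ResNet back end
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 2),
        )

    def forward(self, spec):                     # spec: (batch, 1, freq, time)
        mask = self.attention(spec)              # values in (0, 1)
        return self.classifier(spec * mask)      # genuine-vs-replay logits

logits = AttentiveFiltering()(torch.randn(4, 1, 257, 400))   # (4, 2)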

Speaker Verification

Median-Based Generation of Synthetic Speech Durations using a Non-Parametric Approach

no code implementations • 22 Aug 2016 • Srikanth Ronanki, Oliver Watts, Simon King, Gustav Eje Henter

This paper proposes a new approach to duration modelling for statistical parametric speech synthesis in which a recurrent statistical model is trained to output a phone transition probability at each timestep (acoustic frame).
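As a hedged illustration of how durations could be read off such per-frame transition probabilities, the sketch below treats the sequence p_t emitted by the recurrent model as defining a duration distribution and returns its median; this is a simplified reading of the idea, not the paper's exact recipe.

import numpy as np

def median_duration(transition_probs):
    """Illustrative only: given per-frame phone-transition probabilities p_t
    (as the recurrent model would emit), treat P(dur = t) = p_t * prod_{i<t}(1 - p_i)
    and return the median duration in frames, i.e. the first t where the CDF
    reaches 0.5 (assumes the probabilities eventually carry the CDF past 0.5)."""
    p = np.asarray(transition_probs, dtype=float)
    stay = np.concatenate(([1.0], np.cumprod(1.0 - p)[:-1]))  # prob. of not yet having transitioned
    pmf = p * stay                                            # duration distribution
    cdf = np.cumsum(pmf)
    return int(np.searchsorted(cdf, 0.5) + 1)                 # duration in frames (1-indexed)

print(median_duration([0.05, 0.1, 0.2, 0.4, 0.6, 0.8]))       # -> 4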

Speech Synthesis

DNN-based Speech Synthesis for Indian Languages from ASCII text

no code implementations • 18 Aug 2016 • Srikanth Ronanki, Siva Reddy, Bajibabu Bollepalli, Simon King

These methods first convert the ASCII text to a phonetic script, and then train a Deep Neural Network to synthesize speech from it.

Speech Synthesis · Text-To-Speech Synthesis

Improving Trajectory Modelling for DNN-based Speech Synthesis by using Stacked Bottleneck Features and Minimum Generation Error Training

no code implementations • 22 Feb 2016 • Zhizheng Wu, Simon King

We propose two novel techniques, stacked bottleneck features and a minimum generation error training criterion, to improve the performance of deep neural network (DNN)-based speech synthesis.
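The sketch below illustrates the stacked-bottleneck-feature idea under stated assumptions (layer sizes, a ±2-frame stacking context, and tanh activations are guesses, and the minimum generation error criterion is omitted): bottleneck activations from a first DNN are stacked across neighbouring frames and appended to the linguistic input of a second DNN.

import torch
import torch.nn as nn

# Illustrative sketch only; dimensions and context width are assumptions.
LING_DIM, BN_DIM, ACOUSTIC_DIM, CONTEXT = 300, 32, 187, 2

first_dnn = nn.Sequential(                 # DNN with a narrow bottleneck layer
    nn.Linear(LING_DIM, 1024), nn.Tanh(),
    nn.Linear(1024, BN_DIM),  nn.Tanh(),   # bottleneck activations are reused below
    nn.Linear(BN_DIM, ACOUSTIC_DIM),
)

second_dnn = nn.Sequential(                # takes linguistic features + stacked bottlenecks
    nn.Linear(LING_DIM + BN_DIM * (2 * CONTEXT + 1), 1024), nn.Tanh(),
    nn.Linear(1024, ACOUSTIC_DIM),
)

def stacked_bottlenecks(ling):             # ling: (frames, LING_DIM)
    bottleneck = first_dnn[:4](ling)       # (frames, BN_DIM) taken at the bottleneck layer
    padded = torch.cat([bottleneck[:1].repeat(CONTEXT, 1),
                        bottleneck,
                        bottleneck[-1:].repeat(CONTEXT, 1)])
    stacked = torch.cat([padded[i:i + len(ling)] for i in range(2 * CONTEXT + 1)], dim=1)
    return second_dnn(torch.cat([ling, stacked], dim=1))   # refined acoustic prediction

out = stacked_bottlenecks(torch.randn(100, LING_DIM))      # (100, ACOUSTIC_DIM)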

Speech Synthesis

Investigating gated recurrent neural networks for speech synthesis

no code implementations • 11 Jan 2016 • Zhizheng Wu, Simon King

Recently, recurrent neural networks (RNNs) as powerful sequence models have re-emerged as a potential acoustic model for statistical parametric speech synthesis (SPSS).

Speech Synthesis
