Search Results for author: Adriana Stan

Found 12 papers, 1 paper with code

An analysis of large speech models-based representations for speech emotion recognition

no code implementations • 1 Nov 2023 • Adrian Bogdan Stânea, Vlad Striletchi, Cosmin Striletchi, Adriana Stan

Large speech models-derived features have recently shown increased performance over signal-based features across multiple downstream tasks, even when the networks are not finetuned towards the target task.

Speech Emotion Recognition
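The abstract above refers to using frozen representations from a large pretrained speech model as input features for a downstream emotion classifier. A minimal sketch of that setup is given below; it is not the authors' code, and the wav2vec 2.0 checkpoint, the mean-pooling step and the suggested downstream classifier are assumptions made for illustration.

```python
# Minimal sketch (not the authors' implementation): frozen features from a large
# pretrained speech model, mean-pooled per utterance, for emotion recognition.
# The model checkpoint and pooling strategy are illustrative assumptions.
import torch
import torchaudio
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Model

MODEL_NAME = "facebook/wav2vec2-base"  # assumed checkpoint
extractor = Wav2Vec2FeatureExtractor.from_pretrained(MODEL_NAME)
model = Wav2Vec2Model.from_pretrained(MODEL_NAME).eval()  # kept frozen, no fine-tuning

def utterance_features(wav_path: str) -> torch.Tensor:
    """Return a single mean-pooled feature vector for one utterance."""
    waveform, sr = torchaudio.load(wav_path)
    mono = torchaudio.functional.resample(waveform, sr, 16_000).mean(dim=0)
    inputs = extractor(mono.numpy(), sampling_rate=16_000, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state  # (1, frames, dim)
    return hidden.mean(dim=1).squeeze(0)            # (dim,)

# The resulting vectors can then be fed to any simple classifier (e.g. logistic
# regression) trained on emotion labels, with the speech model left untouched.
```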

Towards generalisable and calibrated synthetic speech detection with self-supervised representations

no code implementations • 11 Sep 2023 • Dan Oneata, Adriana Stan, Octavian Pascu, Elisabeta Oneata, Horia Cucu

Generalisation -- the ability of a model to perform well on unseen data -- is crucial for building reliable deep fake detectors.

Synthetic Speech Detection

An analysis on the effects of speaker embedding choice in non auto-regressive TTS

no code implementations • 19 Jul 2023 • Adriana Stan, Johannah O'Mahony

In this paper we introduce a first attempt at understanding how a non-autoregressive factorised multi-speaker speech synthesis architecture exploits the information present in different speaker embedding sets.

Representation Learning • Speech Synthesis

Residual Information in Deep Speaker Embedding Architectures

no code implementations • 6 Feb 2023 • Adriana Stan

This means that the embeddings are far from ideal, highly dependent on the training corpus and still include a degree of residual information pertaining to factors such as linguistic content, recording conditions or speaking style of the utterance.
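Residual information of this kind is typically exposed with a probing classifier: if a simple model can predict a non-speaker factor from the speaker embeddings alone, that factor is still encoded in them. The sketch below assumes precomputed embeddings and hypothetical recording-condition labels; it illustrates the probing idea, not the paper's actual evaluation protocol.

```python
# Probing sketch (not the paper's protocol): above-chance accuracy when predicting
# a non-speaker factor from speaker embeddings indicates residual information.
# The file names and the "recording condition" labels are hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

embeddings = np.load("speaker_embeddings.npy")      # (n_utterances, dim), assumed precomputed
conditions = np.load("recording_conditions.npy")    # (n_utterances,), assumed labels

probe = LogisticRegression(max_iter=1000)
accuracy = cross_val_score(probe, embeddings, conditions, cv=5).mean()
print(f"probe accuracy: {accuracy:.3f} (compare against 1/number_of_classes)")
```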

FlexLip: A Controllable Text-to-Lip System

no code implementations • 7 Jun 2022 • Dan Oneata, Beata Lorincz, Adriana Stan, Horia Cucu

This modularity enables the easy replacement of each of its components, while also ensuring the fast adaptation to new speaker identities by disentangling or projecting the input features.

Audio Generation • Text-to-Video Generation +1

Speaker verification-derived loss and data augmentation for DNN-based multispeaker speech synthesis

no code implementations • 3 Jun 2021 • Beata Lorincz, Adriana Stan, Mircea Giurgiu

Building multispeaker neural network-based text-to-speech synthesis systems commonly relies on the availability of large amounts of high-quality recordings from each speaker and on conditioning the training process on the speaker's identity or on a learned representation of it.

Data Augmentation • Speaker Verification +2
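As a rough illustration of the conditioning mentioned in the abstract, a multispeaker acoustic model usually combines a per-speaker vector, either a learned lookup embedding or an external speaker-verification embedding, with the text-encoder states. The module below is a generic sketch with assumed dimensions, not the paper's architecture.

```python
# Generic sketch of speaker conditioning in multispeaker TTS (not the paper's model).
# The dimensions and the nn.Embedding lookup table are assumptions.
import torch
import torch.nn as nn

class SpeakerConditioning(nn.Module):
    def __init__(self, n_speakers: int, text_dim: int = 256, spk_dim: int = 64):
        super().__init__()
        self.speaker_table = nn.Embedding(n_speakers, spk_dim)  # learned speaker representation
        self.proj = nn.Linear(text_dim + spk_dim, text_dim)

    def forward(self, text_states: torch.Tensor, speaker_id: torch.Tensor) -> torch.Tensor:
        # text_states: (batch, frames, text_dim); speaker_id: (batch,)
        spk = self.speaker_table(speaker_id)                     # (batch, spk_dim)
        spk = spk.unsqueeze(1).expand(-1, text_states.size(1), -1)
        return self.proj(torch.cat([text_states, spk], dim=-1))  # (batch, frames, text_dim)

# A speaker-verification-derived embedding could replace the lookup table by passing
# the external vector in place of self.speaker_table(speaker_id).
```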

An objective evaluation of the effects of recording conditions and speaker characteristics in multi-speaker deep neural speech synthesis

no code implementations • 3 Jun 2021 • Beata Lorincz, Adriana Stan, Mircea Giurgiu

The visualisation of the t-SNE projections of the natural and synthesised speaker embeddings show that the acoustic model shifts some of the speakers' neural representation, but not all of them.

Speaker Verification • Speech Synthesis +1
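The t-SNE comparison described above can be reproduced in outline as follows; the file names are placeholders and the t-SNE settings are arbitrary assumptions, so this is only a sketch of the visualisation, not the paper's exact figure.

```python
# Sketch of the t-SNE visualisation described above (placeholder file names,
# arbitrary t-SNE settings): project natural and synthesised speaker embeddings
# jointly into 2-D and compare their positions per speaker.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

natural = np.load("natural_speaker_embeddings.npy")          # (n, dim), assumed
synthesised = np.load("synthesised_speaker_embeddings.npy")  # (n, dim), assumed

joint = np.concatenate([natural, synthesised], axis=0)
proj = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(joint)

plt.scatter(*proj[: len(natural)].T, alpha=0.6, label="natural")
plt.scatter(*proj[len(natural):].T, alpha=0.6, label="synthesised")
plt.legend()
plt.title("t-SNE of natural vs. synthesised speaker embeddings")
plt.show()
```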

Speaker disentanglement in video-to-speech conversion

1 code implementation • 20 May 2021 • Dan Oneata, Adriana Stan, Horia Cucu

The task of video-to-speech aims to translate silent video of lip movement to its corresponding audio signal.

Disentanglement • Speech Synthesis

RECOApy: Data recording, pre-processing and phonetic transcription for end-to-end speech-based applications

no code implementations • 11 Sep 2020 • Adriana Stan

RECOApy streamlines the steps of data recording and pre-processing required in end-to-end speech-based applications.

RSS-TOBI - A Prosodically Enhanced Romanian Speech Corpus

no code implementations • LREC 2014 • Tiberiu Boroș, Adriana Stan, Oliver Watts, Stefan Daniel Dumitrescu

This paper introduces a recent development of a Romanian Speech corpus to include prosodic annotations of the speech data in the form of ToBI labels.

Speech Synthesis • Text-To-Speech Synthesis
