Search Results for author: Adriana Stan

Found 12 papers, 1 paper with code

An analysis of large speech models-based representations for speech emotion recognition

no code implementations • 1 Nov 2023 • Adrian Bogdan Stânea, Vlad Striletchi, Cosmin Striletchi, Adriana Stan

Large speech models-derived features have recently shown increased performance over signal-based features across multiple downstream tasks, even when the networks are not finetuned towards the target task.

Speech Emotion Recognition

Towards generalisable and calibrated synthetic speech detection with self-supervised representations

no code implementations • 11 Sep 2023 • Dan Oneata, Adriana Stan, Octavian Pascu, Elisabeta Oneata, Horia Cucu

Generalisation -- the ability of a model to perform well on unseen data -- is crucial for building reliable deep fake detectors.

Synthetic Speech Detection

An analysis on the effects of speaker embedding choice in non auto-regressive TTS

no code implementations • 19 Jul 2023 • Adriana Stan, Johannah O'Mahony

In this paper we introduce a first attempt at understanding how a non-autoregressive factorised multi-speaker speech synthesis architecture exploits the information present in different speaker embedding sets.

Representation Learning Speech Synthesis

Residual Information in Deep Speaker Embedding Architectures

no code implementations • 6 Feb 2023 • Adriana Stan

This means that the embeddings are far from ideal, highly dependent on the training corpus and still include a degree of residual information pertaining to factors such as linguistic content, recording conditions or speaking style of the utterance.

The ZevoMOS entry to VoiceMOS Challenge 2022

no code implementations • 15 Jun 2022 • Adriana Stan

This paper introduces the ZevoMOS entry to the main track of the VoiceMOS Challenge 2022.

Automatic Speech Recognition (ASR) +2

FlexLip: A Controllable Text-to-Lip System

no code implementations • 7 Jun 2022 • Dan Oneata, Beata Lorincz, Adriana Stan, Horia Cucu

This modularity enables the easy replacement of each of its components, while also ensuring the fast adaptation to new speaker identities by disentangling or projecting the input features.

Audio Generation Text-to-Video Generation +1

Speaker verification-derived loss and data augmentation for DNN-based multispeaker speech synthesis

no code implementations • 3 Jun 2021 • Beata Lorincz, Adriana Stan, Mircea Giurgiu

Building multispeaker neural network-based text-to-speech synthesis systems commonly relies on the availability of large amounts of high quality recordings from each speaker and conditioning the training process on the speaker's identity or on a learned representation of it.

Data Augmentation Speaker Verification +2

An objective evaluation of the effects of recording conditions and speaker characteristics in multi-speaker deep neural speech synthesis

no code implementations • 3 Jun 2021 • Beata Lorincz, Adriana Stan, Mircea Giurgiu

The visualisation of the t-SNE projections of the natural and synthesised speaker embeddings shows that the acoustic model shifts some of the speakers' neural representations, but not all of them.

Speaker Verification Speech Synthesis +1

Speaker disentanglement in video-to-speech conversion

1 code implementation • 20 May 2021 • Dan Oneata, Adriana Stan, Horia Cucu

The task of video-to-speech aims to translate silent video of lip movement to its corresponding audio signal.

Disentanglement Speech Synthesis

An evaluation of word-level confidence estimation for end-to-end automatic speech recognition

no code implementations • 14 Jan 2021 • Dan Oneata, Alexandru Caranica, Adriana Stan, Horia Cucu

In this paper we investigate confidence estimation for end-to-end automatic speech recognition (ASR).

Automatic Speech Recognition (ASR) +1

RECOApy: Data recording, pre-processing and phonetic transcription for end-to-end speech-based applications

no code implementations • 11 Sep 2020 • Adriana Stan

RECOApy streamlines the steps of data recording and pre-processing required in end-to-end speech-based applications.

RSS-TOBI - A Prosodically Enhanced Romanian Speech Corpus

no code implementations • LREC 2014 • Tiberiu Boroș, Adriana Stan, Oliver Watts, Stefan Daniel Dumitrescu

This paper introduces a recent development of a Romanian Speech corpus to include prosodic annotations of the speech data in the form of ToBI labels.

Speech Synthesis Text-To-Speech Synthesis
