Search Results for author: Erik Visser

Found 7 papers, 0 papers with code

Stylebook: Content-Dependent Speaking Style Modeling for Any-to-Any Voice Conversion using Only Speech Data

no code implementations • 6 Sep 2023 • Hyungseob Lim, Kyungguen Byun, Sunkuk Moon, Erik Visser

Finally, content information extracted from the source speech and content-dependent target style embeddings are fed into a diffusion-based decoder to generate the converted speech mel-spectrogram.

Self-Supervised Learning Voice Conversion

Parameter Efficient Audio Captioning With Faithful Guidance Using Audio-text Shared Latent Representation

no code implementations • 6 Sep 2023 • Arvind Krishna Sridhar, Yinyi Guo, Erik Visser, Rehana Mahfuz

Then, we propose a parameter-efficient, inference-time faithful decoding algorithm that enables smaller audio captioning models to perform on par with larger models trained with more data.

Audio captioning Data Augmentation +4

Improved Beam Search for Hallucination Mitigation in Abstractive Summarization

no code implementations • 6 Dec 2022 • Arvind Krishna Sridhar, Erik Visser

In this paper, we investigate the use of the Natural Language Inference (NLI) entailment metric to detect and prevent hallucinations in summary generation.

Abstractive Text Summarization Hallucination +3

Application of Knowledge Distillation to Multi-task Speech Representation Learning

no code implementations • 29 Oct 2022 • Mine Kerpicci, Van Nguyen, Shuhua Zhang, Erik Visser

Model architectures such as wav2vec 2.0 and HuBERT have been proposed to learn speech representations from audio waveforms in a self-supervised manner.

Keyword Spotting Knowledge Distillation +4

Activity report analysis with automatic single or multispan answer extraction

no code implementations • 9 Sep 2022 • Ravi Choudhary, Arvind Krishna Sridhar, Erik Visser

Depending on the context and the type of question asked, a question answering (QA) system would need to automatically determine whether the answer covers single-span or multi-span text components.

Anomaly Detection Question Answering

Multi-task Voice Activated Framework using Self-supervised Learning

no code implementations • 3 Oct 2021 • Shehzeen Hussain, Van Nguyen, Shuhua Zhang, Erik Visser

Finally, we extend our framework to perform multi-task learning by jointly optimizing the network parameters on multiple voice activated tasks using a shared transformer backbone.

Ranked #6 on Speaker Verification on VoxCeleb

Emotion Classification Keyword Spotting +5

Incremental Learning Algorithm for Sound Event Detection

no code implementations • 26 Mar 2020 • Eunjeong Koh, Fatemeh Saki, Yinyi Guo, Cheng-Yu Hung, Erik Visser

The neural adapter layer helps the target model learn new sound events from minimal training data while keeping its performance on previously learned sound events close to that of the source model.

Event Detection Incremental Learning +1
