Search Results for author: Pavel Denisov

Found 17 papers, 7 papers with code

Teaching a Multilingual Large Language Model to Understand Multilingual Speech via Multi-Instructional Training

1 code implementation 16 Apr 2024 Pavel Denisov, Ngoc Thang Vu

Our zero-shot evaluation results confirm the robustness of our approach across multiple tasks, including speech translation and multilingual spoken language understanding, thereby opening new avenues for applying LLMs in the speech domain.

Language Modelling Large Language Model +3

The IMS Toucan System for the Blizzard Challenge 2023

1 code implementation 26 Oct 2023 Florian Lux, Julia Koch, Sarina Meyer, Thomas Bott, Nadja Schauffler, Pavel Denisov, Antje Schweitzer, Ngoc Thang Vu

For our contribution to the Blizzard Challenge 2023, we improved on the system we submitted to the Blizzard Challenge 2021.

Leveraging Multilingual Self-Supervised Pretrained Models for Sequence-to-Sequence End-to-End Spoken Language Understanding

1 code implementation 9 Oct 2023 Pavel Denisov, Ngoc Thang Vu

A number of methods have been proposed for End-to-End Spoken Language Understanding (E2E-SLU) using pretrained models; however, their evaluation often lacks a multilingual setup and tasks that require prediction of lexical fillers, such as slot filling.

slot-filling Slot Filling +3

Anonymizing Speech with Generative Adversarial Networks to Preserve Speaker Privacy

1 code implementation 13 Oct 2022 Sarina Meyer, Pascal Tilli, Pavel Denisov, Florian Lux, Julia Koch, Ngoc Thang Vu

To protect the privacy of speech data, speaker anonymization aims to hide the identity of a speaker by changing the voice in speech recordings.

Generative Adversarial Network

Speaker Anonymization with Phonetic Intermediate Representations

1 code implementation 11 Jul 2022 Sarina Meyer, Florian Lux, Pavel Denisov, Julia Koch, Pascal Tilli, Ngoc Thang Vu

In this work, we propose a speaker anonymization pipeline that leverages high quality automatic speech recognition and synthesis systems to generate speech conditioned on phonetic transcriptions and anonymized speaker embeddings.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

ESPnet-SLU: Advancing Spoken Language Understanding through ESPnet

2 code implementations 29 Nov 2021 Siddhant Arora, Siddharth Dalmia, Pavel Denisov, Xuankai Chang, Yushi Ueda, Yifan Peng, Yuekai Zhang, Sujay Kumar, Karthik Ganesan, Brian Yan, Ngoc Thang Vu, Alan W Black, Shinji Watanabe

However, there are few open source toolkits that can be used to generate reproducible results on different Spoken Language Understanding (SLU) benchmarks.

Spoken Language Understanding

Pretrained Semantic Speech Embeddings for End-to-End Spoken Language Understanding via Cross-Modal Teacher-Student Learning

no code implementations 3 Jul 2020 Pavel Denisov, Ngoc Thang Vu

Spoken language understanding is typically based on pipeline architectures including speech recognition and natural language understanding steps.

Natural Language Understanding speech-recognition +2

ADVISER: A Toolkit for Developing Multi-modal, Multi-domain and Socially-engaged Conversational Agents

1 code implementation ACL 2020 Chia-Yu Li, Daniel Ortega, Dirk Väth, Florian Lux, Lindsey Vanderlyn, Maximilian Schmidt, Michael Neumann, Moritz Völkel, Pavel Denisov, Sabrina Jenne, Zorica Kacarevic, Ngoc Thang Vu

We present ADVISER - an open-source, multi-domain dialog system toolkit that enables the development of multi-modal (incorporating speech, text and vision), socially-engaged (e.g. emotion recognition, engagement level prediction and backchanneling) conversational agents.

BIG-bench Machine Learning Emotion Recognition

IMS-Speech: A Speech to Text Tool

no code implementations 13 Aug 2019 Pavel Denisov, Ngoc Thang Vu

We present IMS-Speech, a web-based tool for German and English speech transcription that aims to facilitate research in various disciplines which require access to lexical information in spoken language materials.

Ranked #4 on Speech Recognition on TUDA (using extra training data)

speech-recognition Speech Recognition

Unsupervised Domain Adaptation by Adversarial Learning for Robust Speech Recognition

no code implementations 30 Jul 2018 Pavel Denisov, Ngoc Thang Vu, Marc Ferras Font

In this paper, we investigate the use of adversarial learning for unsupervised adaptation to unseen recording conditions, more specifically, single microphone far-field speech.

Robust Speech Recognition speech-recognition +1
