Search Results for author: Lukáš Burget

Found 33 papers, 17 papers with code

Target Speaker ASR with Whisper

no code implementations • 14 Sep 2024 • Alexander Polok, Dominik Klement, Matthew Wiesner, Sanjeev Khudanpur, Jan Černocký, Lukáš Burget

We propose a novel approach to enable the use of large, single speaker ASR models, such as Whisper, for target speaker ASR.

Speech Separation

Beyond the Labels: Unveiling Text-Dependency in Paralinguistic Speech Recognition Datasets

no code implementations • 12 Mar 2024 • Jan Pešán, Santosh Kesiraju, Lukáš Burget, Jan "Honza" Černocký

This paper critically evaluates the prevalent assumption that machine learning models trained on such datasets genuinely learn to identify paralinguistic traits, rather than merely capturing lexical features.

Speech Recognition

Discriminative Training of VBx Diarization

1 code implementation • 4 Oct 2023 • Dominik Klement, Mireia Diez, Federico Landini, Lukáš Burget, Anna Silnova, Marc Delcroix, Naohiro Tawara

Bayesian HMM clustering of x-vector sequences (VBx) has become a widely adopted diarization baseline model in publications and challenges.

Bayesian Inference

Hystoc: Obtaining word confidences for fusion of end-to-end ASR systems

no code implementations • 21 May 2023 • Karel Beneš, Martin Kocour, Lukáš Burget

Furthermore, we show that utilizing Hystoc in fusion of multiple e2e ASR systems increases the gains from the fusion by up to 1% WER absolute on the Spanish RTVE2020 dataset.

Automatic Speech Recognition, Speech Recognition

Improving Speaker Verification with Self-Pretrained Transformer Models

no code implementations • 17 May 2023 • Junyi Peng, Oldřich Plchot, Themos Stafylakis, Ladislav Mošner, Lukáš Burget, Jan Černocký

Recently, fine-tuning large pre-trained Transformer models using downstream datasets has received a rising interest.

Speaker Verification

Multi-Speaker and Wide-Band Simulated Conversations as Training Data for End-to-End Neural Diarization

3 code implementations • 12 Nov 2022 • Federico Landini, Mireia Diez, Alicia Lozano-Diez, Lukáš Burget

End-to-end diarization presents an attractive alternative to standard cascaded diarization systems because a single system can handle all aspects of the task at once.

Action Detection, Activity Detection

Parameter-efficient transfer learning of pre-trained Transformer models for speaker verification using adapters

no code implementations • 28 Oct 2022 • Junyi Peng, Themos Stafylakis, Rongzhi Gu, Oldřich Plchot, Ladislav Mošner, Lukáš Burget, Jan Černocký

Recently, the pre-trained Transformer models have received a rising interest in the field of speech processing thanks to their great success in various downstream tasks.

Speaker Verification, Transfer Learning

Toroidal Probabilistic Spherical Discriminant Analysis

2 code implementations • 27 Oct 2022 • Anna Silnova, Niko Brümmer, Albert Swart, Lukáš Burget

It extends PSDA with the ability to model within and between-speaker variabilities in toroidal submanifolds of the hypersphere.

Speaker Recognition

Speaker adaptation for Wav2vec2 based dysarthric ASR

no code implementations • 2 Apr 2022 • Murali Karthick Baskar, Tim Herzig, Diana Nguyen, Mireia Diez, Tim Polzehl, Lukáš Burget, Jan "Honza" Černocký

Speaker adaptation using fMLLR and xvectors has provided major gains for dysarthric speech with very little adaptation data.

Speech Recognition

Probabilistic Spherical Discriminant Analysis: An Alternative to PLDA for length-normalized embeddings

3 code implementations • 28 Mar 2022 • Niko Brümmer, Albert Swart, Ladislav Mošner, Anna Silnova, Oldřich Plchot, Themos Stafylakis, Lukáš Burget

In speaker recognition, where speech segments are mapped to embeddings on the unit hypersphere, two scoring backends are commonly used, namely cosine scoring or PLDA.

Speaker Recognition

MultiSV: Dataset for Far-Field Multi-Channel Speaker Verification

1 code implementation • 11 Nov 2021 • Ladislav Mošner, Oldřich Plchot, Lukáš Burget, Jan Černocký

Motivated by the unconsolidated data situation and the lack of a standard benchmark in the field, we complement our previous efforts and present a comprehensive corpus designed for training and evaluating text-independent multi-channel speaker verification systems.

Denoising, Speaker Verification

Revisiting joint decoding based multi-talker speech recognition with DNN acoustic model

1 code implementation • 31 Oct 2021 • Martin Kocour, Kateřina Žmolíková, Lucas Ondel, Ján Švec, Marc Delcroix, Tsubasa Ochiai, Lukáš Burget, Jan Černocký

We modify the acoustic model to predict joint state posteriors for all speakers, enabling the network to express uncertainty about the attribution of parts of the speech signal to the speakers.

Decoder, Speech Recognition

GPU-Accelerated Forward-Backward algorithm with Application to Lattice-Free MMI

no code implementations • 22 Oct 2021 • Lucas Ondel, Léa-Marie Lam-Yee-Mui, Martin Kocour, Caio Filippo Corro, Lukáš Burget

We propose to express the forward-backward algorithm in terms of operations between sparse matrices in a specific semiring.

Integration of variational autoencoder and spatial clustering for adaptive multi-channel neural speech separation

1 code implementation • 24 Nov 2020 • Katerina Zmolikova, Marc Delcroix, Lukáš Burget, Tomohiro Nakatani, Jan "Honza" Černocký

In this paper, we propose a method combining variational autoencoder model of speech with a spatial clustering approach for multi-channel speech separation.

Audio and Speech Processing

Text Augmentation for Language Models in High Error Recognition Scenario

1 code implementation • 11 Nov 2020 • Karel Beneš, Lukáš Burget

We examine the effect of data augmentation for training of language models for speech recognition.

Speech Recognition

Multiwavelength classification of X-ray selected galaxy cluster candidates using convolutional neural networks

no code implementations • 10 Jun 2020 • Matej Kosiba, Maggie Lieu, Bruno Altieri, Nicolas Clerc, Lorenzo Faccioli, Sarah Kendrew, Ivan Valtchanov, Tatyana Sadibekova, Marguerite Pierre, Filip Hroch, Norbert Werner, Lukáš Burget, Christian Garrel, Elias Koulouridis, Evelina Gaynullina, Mona Molham, Miriam E. Ramos-Ceja, Alina Khalikova

The results of using CNNs on combined X-ray and optical data for galaxy cluster candidate classification are encouraging and there is a lot of potential for future usage and improvements.

Cosmology and Nongalactic Astrophysics, High Energy Astrophysical Phenomena, Instrumentation and Methods for Astrophysics

Probabilistic embeddings for speaker diarization

1 code implementation • 6 Apr 2020 • Anna Silnova, Niko Brümmer, Johan Rohdin, Themos Stafylakis, Lukáš Burget

We apply the proposed probabilistic embeddings as input to an agglomerative hierarchical clustering (AHC) algorithm to do diarization in the DIHARD'19 evaluation set.

Clustering, Speaker Diarization

A Multi Purpose and Large Scale Speech Corpus in Persian and English for Speaker and Speech Recognition: the DeepMine Database

no code implementations • 8 Dec 2019 • Hossein Zeinali, Lukáš Burget, Jan "Honza" Černocký

We also provide the results of several experiments that can be considered as baselines: HMM-based i-vectors for text-dependent speaker verification, and HMM-based as well as state-of-the-art deep neural network based ASR.

Speech Recognition

Learning document embeddings along with their uncertainties

2 code implementations • 20 Aug 2019 • Santosh Kesiraju, Oldřich Plchot, Lukáš Burget, Suryakanth V. Gangashetty

We present the Bayesian subspace multinomial model (Bayesian SMM), a generative log-linear model that learns to represent documents in the form of Gaussian distributions, thereby encoding the uncertainty in its covariance.

Topic Models, Variational Inference

Detecting Spoofing Attacks Using VGG and SincNet: BUT-Omilia Submission to ASVspoof 2019 Challenge

no code implementations • 13 Jul 2019 • Hossein Zeinali, Themos Stafylakis, Georgia Athanasopoulou, Johan Rohdin, Ioannis Gkinis, Lukáš Burget, Jan "Honza" Černocký

In this paper, we present the system description of the joint efforts of Brno University of Technology (BUT) and Omilia -- Conversational Intelligence for the ASVSpoof2019 Spoofing and Countermeasures Challenge.

Bayesian Subspace Hidden Markov Model for Acoustic Unit Discovery

1 code implementation • 8 Apr 2019 • Lucas Ondel, Hari Krishna Vydana, Lukáš Burget, Jan Černocký

This work tackles the problem of learning a set of language specific acoustic units from unlabeled speech recordings given a set of labeled recordings from other languages.

Acoustic Unit Discovery

BUT-FIT at SemEval-2019 Task 7: Determining the Rumour Stance with Pre-Trained Deep Bidirectional Transformers

1 code implementation • SEMEVAL 2019 • Martin Fajcik, Lukáš Burget, Pavel Smrz

This paper describes our system submitted to SemEval 2019 Task 7: RumourEval 2019: Determining Rumour Veracity and Support for Rumours, Subtask A (Gorrell et al., 2019).

General Classification, Rumour Detection

Promising Accurate Prefix Boosting for sequence-to-sequence ASR

no code implementations • 7 Nov 2018 • Murali Karthick Baskar, Lukáš Burget, Shinji Watanabe, Martin Karafiát, Takaaki Hori, Jan Honza Černocký

In this paper, we present promising accurate prefix boosting (PAPB), a discriminative training technique for attention based sequence-to-sequence (seq2seq) ASR.

BUT QUESST 2014 System Description

no code implementations • 16 Oct 2014 • Igor Szöke, Miroslav Skácel, Lukáš Burget

The primary system we submitted was composed of 11 subsystems as the required run.

 Ranked #1 on Keyword Spotting on QUESST (MinCnxe metric)

Dynamic Time Warping, Keyword Spotting
