no code implementations • 23 Sep 2024 • Junyi Peng, Ladislav Mošner, Lin Zhang, Oldřich Plchot, Themos Stafylakis, Lukáš Burget, Jan Černocký
Self-supervised learning (SSL) models for speaker verification (SV) have gained significant attention in recent years.
no code implementations • 14 Sep 2024 • Alexander Polok, Dominik Klement, Matthew Wiesner, Sanjeev Khudanpur, Jan Černocký, Lukáš Burget
We propose a novel approach to enable the use of large, single speaker ASR models, such as Whisper, for target speaker ASR.
no code implementations • 12 Mar 2024 • Jan Pešán, Santosh Kesiraju, Lukáš Burget, Jan "Honza" Černocký
This paper critically evaluates the prevalent assumption that machine learning models trained on such datasets genuinely learn to identify paralinguistic traits, rather than merely capturing lexical features.
1 code implementation • 7 Dec 2023 • Federico Landini, Mireia Diez, Themos Stafylakis, Lukáš Burget
Until recently, the field of speaker diarization was dominated by cascaded systems.
1 code implementation • 4 Oct 2023 • Dominik Klement, Mireia Diez, Federico Landini, Lukáš Burget, Anna Silnova, Marc Delcroix, Naohiro Tawara
Bayesian HMM clustering of x-vector sequences (VBx) has become a widely adopted diarization baseline model in publications and challenges.
no code implementations • 21 May 2023 • Karel Beneš, Martin Kocour, Lukáš Burget
Furthermore, we show that utilizing Hystoc in fusion of multiple e2e ASR systems increases the gains from the fusion by up to 1% WER absolute on the Spanish RTVE2020 dataset.
no code implementations • 17 May 2023 • Junyi Peng, Oldřich Plchot, Themos Stafylakis, Ladislav Mošner, Lukáš Burget, Jan Černocký
Recently, fine-tuning large pre-trained Transformer models using downstream datasets has attracted rising interest.
3 code implementations • 12 Nov 2022 • Federico Landini, Mireia Diez, Alicia Lozano-Diez, Lukáš Burget
End-to-end diarization presents an attractive alternative to standard cascaded diarization systems because a single system can handle all aspects of the task at once.
no code implementations • 28 Oct 2022 • Junyi Peng, Themos Stafylakis, Rongzhi Gu, Oldřich Plchot, Ladislav Mošner, Lukáš Burget, Jan Černocký
Recently, pre-trained Transformer models have attracted rising interest in the field of speech processing thanks to their great success in various downstream tasks.
2 code implementations • 27 Oct 2022 • Anna Silnova, Niko Brümmer, Albert Swart, Lukáš Burget
It extends PSDA with the ability to model within- and between-speaker variabilities in toroidal submanifolds of the hypersphere.
no code implementations • 2 Apr 2022 • Murali Karthick Baskar, Tim Herzig, Diana Nguyen, Mireia Diez, Tim Polzehl, Lukáš Burget, Jan "Honza" Černocký
Speaker adaptation using fMLLR and x-vectors has provided major gains for dysarthric speech with very little adaptation data.
2 code implementations • 2 Apr 2022 • Federico Landini, Alicia Lozano-Diez, Mireia Diez, Lukáš Burget
However, simulated mixtures do not resemble real conversations in many aspects.
no code implementations • 29 Mar 2022 • Themos Stafylakis, Ladislav Mošner, Oldřich Plchot, Johan Rohdin, Anna Silnova, Lukáš Burget, Jan "Honza" Černocký
In this paper, we demonstrate a method for training speaker embedding extractors using weak annotation.
3 code implementations • 28 Mar 2022 • Niko Brümmer, Albert Swart, Ladislav Mošner, Anna Silnova, Oldřich Plchot, Themos Stafylakis, Lukáš Burget
In speaker recognition, where speech segments are mapped to embeddings on the unit hypersphere, two scoring backends are commonly used, namely cosine scoring or PLDA.
1 code implementation • 11 Nov 2021 • Ladislav Mošner, Oldřich Plchot, Lukáš Burget, Jan Černocký
Motivated by the unconsolidated data situation and the lack of a standard benchmark in the field, we complement our previous efforts and present a comprehensive corpus designed for training and evaluating text-independent multi-channel speaker verification systems.
1 code implementation • 31 Oct 2021 • Martin Kocour, Kateřina Žmolíková, Lucas Ondel, Ján Švec, Marc Delcroix, Tsubasa Ochiai, Lukáš Burget, Jan Černocký
We modify the acoustic model to predict joint state posteriors for all speakers, enabling the network to express uncertainty about the attribution of parts of the speech signal to the speakers.
no code implementations • 22 Oct 2021 • Lucas Ondel, Léa-Marie Lam-Yee-Mui, Martin Kocour, Caio Filippo Corro, Lukáš Burget
We propose to express the forward-backward algorithm in terms of operations between sparse matrices in a specific semiring.
1 code implementation • 13 Apr 2021 • Murali Karthick Baskar, Lukáš Burget, Shinji Watanabe, Ramon Fernandez Astudillo, Jan "Honza" Černocký
Self-supervised ASR-TTS models suffer in out-of-domain data conditions.
1 code implementation • 24 Nov 2020 • Katerina Zmolikova, Marc Delcroix, Lukáš Burget, Tomohiro Nakatani, Jan "Honza" Černocký
In this paper, we propose a method combining a variational autoencoder model of speech with a spatial clustering approach for multi-channel speech separation.
Audio and Speech Processing
1 code implementation • 11 Nov 2020 • Karel Beneš, Lukáš Burget
We examine the effect of data augmentation for training of language models for speech recognition.
2 code implementations • 2 Jul 2020 • Santosh Kesiraju, Sangeet Sagar, Ondřej Glembek, Lukáš Burget, Ján Černocký, Suryakanth V Gangashetty
In this paper, we present a Bayesian multilingual document model for learning language-independent document embeddings.
no code implementations • 10 Jun 2020 • Matej Kosiba, Maggie Lieu, Bruno Altieri, Nicolas Clerc, Lorenzo Faccioli, Sarah Kendrew, Ivan Valtchanov, Tatyana Sadibekova, Marguerite Pierre, Filip Hroch, Norbert Werner, Lukáš Burget, Christian Garrel, Elias Koulouridis, Evelina Gaynullina, Mona Molham, Miriam E. Ramos-Ceja, Alina Khalikova
The results of using CNNs on combined X-ray and optical data for galaxy cluster candidate classification are encouraging and there is a lot of potential for future usage and improvements.
Cosmology and Nongalactic Astrophysics High Energy Astrophysical Phenomena Instrumentation and Methods for Astrophysics
1 code implementation • 6 Apr 2020 • Anna Silnova, Niko Brümmer, Johan Rohdin, Themos Stafylakis, Lukáš Burget
We apply the proposed probabilistic embeddings as input to an agglomerative hierarchical clustering (AHC) algorithm to perform diarization on the DIHARD'19 evaluation set.
no code implementations • 8 Dec 2019 • Hossein Zeinali, Lukáš Burget, Jan "Honza" Černocký
We also provide the results of several experiments that can be considered as baselines: HMM-based i-vectors for text-dependent speaker verification, and HMM-based as well as state-of-the-art deep neural network based ASR.
1 code implementation • 19 Oct 2019 • Federico Landini, Shuai Wang, Mireia Diez, Lukáš Burget, Pavel Matějka, Kateřina Žmolíková, Ladislav Mošner, Oldřich Plchot, Ondřej Novotný, Hossein Zeinali, Johan Rohdin
This paper describes the systems developed by the BUT team for the four tracks of the second DIHARD speech diarization challenge.
2 code implementations • 20 Aug 2019 • Santosh Kesiraju, Oldřich Plchot, Lukáš Burget, Suryakanth V. Gangashetty
We present the Bayesian subspace multinomial model (Bayesian SMM), a generative log-linear model that learns to represent documents in the form of Gaussian distributions, thereby encoding the uncertainty in its covariance.
Ranked #1 on Topic Models on 20 Newsgroups
no code implementations • 13 Jul 2019 • Hossein Zeinali, Themos Stafylakis, Georgia Athanasopoulou, Johan Rohdin, Ioannis Gkinis, Lukáš Burget, Jan "Honza" Černocký
In this paper, we present the system description of the joint efforts of Brno University of Technology (BUT) and Omilia -- Conversational Intelligence for the ASVSpoof2019 Spoofing and Countermeasures Challenge.
no code implementations • 13 Jul 2019 • Hossein Zeinali, Pavel Matějka, Ladislav Mošner, Oldřich Plchot, Anna Silnova, Ondřej Novotný, Ján Profant, Ondřej Glembek, Lukáš Burget
This is a description of our effort in the VOiCES 2019 Speaker Recognition Challenge.
no code implementations • 30 Apr 2019 • Murali Karthick Baskar, Shinji Watanabe, Ramon Astudillo, Takaaki Hori, Lukáš Burget, Jan Černocký
Such techniques derive training procedures and losses able to leverage unpaired speech and/or text data by combining ASR with Text-to-Speech (TTS) models.
Ranked #34 on Semi-Supervised Image Classification on ImageNet - 10% labeled data (Top 5 Accuracy metric)
Automatic Speech Recognition (ASR)
1 code implementation • 8 Apr 2019 • Lucas Ondel, Hari Krishna Vydana, Lukáš Burget, Jan Černocký
This work tackles the problem of learning a set of language-specific acoustic units from unlabeled speech recordings given a set of labeled recordings from other languages.
1 code implementation • SEMEVAL 2019 • Martin Fajcik, Lukáš Burget, Pavel Smrz
This paper describes our system submitted to SemEval 2019 Task 7: RumourEval 2019: Determining Rumour Veracity and Support for Rumours, Subtask A (Gorrell et al., 2019).
no code implementations • 7 Nov 2018 • Murali Karthick Baskar, Lukáš Burget, Shinji Watanabe, Martin Karafiát, Takaaki Hori, Jan Honza Černocký
In this paper, we present promising accurate prefix boosting (PAPB), a discriminative training technique for attention-based sequence-to-sequence (seq2seq) ASR.
no code implementations • 16 Oct 2014 • Igor Szöke, Miroslav Skácel, Lukáš Burget
The primary system we submitted was composed of 11 subsystems as the required run.
Ranked #1 on Keyword Spotting on QUESST (MinCnxe metric)