Search Results for author: Jan Trmal

Found 13 papers, 3 papers with code

GigaSpeech: An Evolving, Multi-domain ASR Corpus with 10,000 Hours of Transcribed Audio

2 code implementations • 13 Jun 2021 • Guoguo Chen, Shuzhou Chai, Guanbo Wang, Jiayu Du, Wei-Qiang Zhang, Chao Weng, Dan Su, Daniel Povey, Jan Trmal, Junbo Zhang, Mingjie Jin, Sanjeev Khudanpur, Shinji Watanabe, Shuaijiang Zhao, Wei Zou, Xiangang Li, Xuchen Yao, Yongqing Wang, Yujun Wang, Zhao You, Zhiyong Yan

This paper introduces GigaSpeech, an evolving, multi-domain English speech recognition corpus with 10,000 hours of high quality labeled audio suitable for supervised training, and 40,000 hours of total audio suitable for semi-supervised and unsupervised training.

Ranked #1 on Speech Recognition on GigaSpeech

Sentence speech-recognition +1


Multi-task self-supervised learning for Robust Speech Recognition

1 code implementation • 25 Jan 2020 • Mirco Ravanelli, Jianyuan Zhong, Santiago Pascual, Pawel Swietojanski, Joao Monteiro, Jan Trmal, Yoshua Bengio

We then propose a revised encoder that better learns short- and long-term speech dynamics with an efficient combination of recurrent and convolutional networks.

Robust Speech Recognition Self-Supervised Learning +1


Induced Inflection-Set Keyword Search in Speech

1 code implementation • WS 2020 • Oliver Adams, Matthew Wiesner, Jan Trmal, Garrett Nicolai, David Yarowsky

We investigate the problem of searching for a lexeme-set in speech by searching for its inflectional variants.


The fifth 'CHiME' Speech Separation and Recognition Challenge: Dataset, task and baselines

no code implementations • 28 Mar 2018 • Jon Barker, Shinji Watanabe, Emmanuel Vincent, Jan Trmal

The CHiME challenge series aims to advance robust automatic speech recognition (ASR) technology by promoting research at the interface of speech and language processing, signal processing, and machine learning.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +4


Automatic Speech Recognition and Topic Identification for Almost-Zero-Resource Languages

no code implementations • 23 Feb 2018 • Matthew Wiesner, Chunxi Liu, Lucas Ondel, Craig Harman, Vimal Manohar, Jan Trmal, Zhongqiang Huang, Najim Dehak, Sanjeev Khudanpur

Automatic speech recognition (ASR) systems often need to be developed for extremely low-resource languages to serve end-uses such as audio content categorization and search.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2


Topic Identification for Speech without ASR

no code implementations • 22 Mar 2017 • Chunxi Liu, Jan Trmal, Matthew Wiesner, Craig Harman, Sanjeev Khudanpur

Modern topic identification (topic ID) systems for speech use automatic speech recognition (ASR) to produce speech transcripts, and perform supervised classification on such ASR outputs.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3


Using of heterogeneous corpora for training of an ASR system

no code implementations • 1 Jun 2017 • Jan Trmal, Gaurav Kumar, Vimal Manohar, Sanjeev Khudanpur, Matt Post, Paul McNamee

The paper summarizes the development of the LVCSR system built as a part of the Pashto speech-translation system at the SCALE (Summer Camp for Applied Language Exploration) 2015 workshop on "Speech-to-text-translation for low-resource languages".

speech-recognition Speech Recognition +2


Low-Resource Contextual Topic Identification on Speech

no code implementations • 17 Jul 2018 • Chunxi Liu, Matthew Wiesner, Shinji Watanabe, Craig Harman, Jan Trmal, Najim Dehak, Sanjeev Khudanpur

In topic identification (topic ID) on real-world unstructured audio, an audio instance of variable topic shifts is first broken into sequential segments, and each segment is independently classified.

General Classification Topic Classification +1


A Coarse-Grained Model for Optimal Coupling of ASR and SMT Systems for Speech Translation

no code implementations • EMNLP 2015 • Gaurav Kumar, Graeme Blackwood, Jan Trmal, Daniel Povey, Sanjeev Khudanpur

Language Modelling Machine Translation +2


New release of Mixer-6: Improved validity for phonetic study of speaker variation and identification

no code implementations • LREC 2016 • Eleanor Chodroff, Matthew Maciejewski, Jan Trmal, Sanjeev Khudanpur, John Godfrey

The Mixer series of speech corpora were collected over several years, principally to support annual NIST evaluations of speaker recognition (SR) technologies.

Speaker Recognition


DiPCo -- Dinner Party Corpus

no code implementations • 30 Sep 2019 • Maarten Van Segbroeck, Ahmed Zaid, Ksenia Kutsenko, Cirenia Huerta, Tinh Nguyen, Xuewen Luo, Björn Hoffmeister, Jan Trmal, Maurizio Omologo, Roland Maas

We present a speech data corpus that simulates a "dinner party" scenario taking place in an everyday home environment.

Benchmarking


CHiME-6 Challenge: Tackling Multispeaker Speech Recognition for Unsegmented Recordings

no code implementations • 20 Apr 2020 • Shinji Watanabe, Michael Mandel, Jon Barker, Emmanuel Vincent, Ashish Arora, Xuankai Chang, Sanjeev Khudanpur, Vimal Manohar, Daniel Povey, Desh Raj, David Snyder, Aswin Shanmugam Subramanian, Jan Trmal, Bar Ben Yair, Christoph Boeddeker, Zhaoheng Ni, Yusuke Fujita, Shota Horiguchi, Naoyuki Kanda, Takuya Yoshioka, Neville Ryant

Following the success of the 1st, 2nd, 3rd, 4th and 5th CHiME challenges, we organize the 6th CHiME Speech Separation and Recognition Challenge (CHiME-6).

speaker-diarization Speaker Diarization +4


Adversarial Attacks and Defenses for Speech Recognition Systems

no code implementations • 31 Mar 2021 • Piotr Żelasko, Sonal Joshi, Yiwen Shao, Jesus Villalba, Jan Trmal, Najim Dehak, Sanjeev Khudanpur

We investigate two threat models: a denial-of-service scenario where fast gradient-sign method (FGSM) or weak projected gradient descent (PGD) attacks are used to degrade the model's word error rate (WER); and a targeted scenario where a more potent imperceptible attack forces the system to recognize a specific phrase.

Adversarial Robustness Automatic Speech Recognition +2

