no code implementations • LREC 2022 • Emma Barker, Jon Barker, Robert Gaizauskas, Ning Ma, Monica Lestari Paramita
We present SNuC, the first published corpus of spoken alphanumeric identifiers of the sort typically used as serial and part numbers in the manufacturing sector.
no code implementations • 3 May 2022 • Jisi Zhang, Catalin Zorila, Rama Doddipatla, Jon Barker
In this paper, we explore an improved framework to train a monoaural neural enhancement model for robust speech recognition.
no code implementations • 8 Apr 2022 • Zehai Tu, Jack Deadman, Ning Ma, Jon Barker
End-to-end models have achieved significant improvements in automatic speech recognition.
Automatic Speech Recognition (ASR)
1 code implementation • 8 Apr 2022 • Zehai Tu, Ning Ma, Jon Barker
Accurate objective speech intelligibility prediction algorithms are of great interest for many applications, such as speech enhancement for hearing aids.
1 code implementation • 8 Apr 2022 • Zehai Tu, Ning Ma, Jon Barker
Non-intrusive intelligibility prediction is important for its application in realistic scenarios, where a clean reference signal is difficult to access.
Automatic Speech Recognition (ASR)
no code implementations • 31 Jan 2022 • Max Ehrlich, Jon Barker, Namitha Padmanabhan, Larry Davis, Andrew Tao, Bryan Catanzaro, Abhinav Shrivastava
Video compression is a central feature of the modern internet, powering technologies from social media to video conferencing.
no code implementations • 15 Jun 2021 • Jisi Zhang, Catalin Zorila, Rama Doddipatla, Jon Barker
The proposed method first uses mixtures of unseparated sources and the mixture invariant training (MixIT) criterion to train a teacher model.
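The mixture invariant training (MixIT) criterion used to train the teacher can be sketched as a short numpy function. The two-mixture setup and assignment-minimisation follow the published MixIT formulation; the variable names and the brute-force enumeration are our own illustration, not the paper's code:

```python
import itertools
import numpy as np

def mixit_loss(est_sources, mix1, mix2):
    """MixIT loss: the separator is fed mix1 + mix2 and emits M outputs;
    each output is assigned to exactly one of the two reference mixtures,
    and the loss is the minimum MSE over all 2^M assignments."""
    M, T = est_sources.shape
    refs = np.stack([mix1, mix2])              # (2, T) reference mixtures
    best = np.inf
    # Enumerate every assignment of the M outputs to the 2 mixtures.
    for assign in itertools.product([0, 1], repeat=M):
        A = np.zeros((2, M))
        A[list(assign), np.arange(M)] = 1.0    # output j -> mixture assign[j]
        remix = A @ est_sources                # (2, T) reconstructed mixtures
        best = min(best, np.mean((remix - refs) ** 2))
    return best
```

Because the loss only constrains sums of outputs, it can be computed from unseparated mixtures alone, which is what makes the unsupervised teacher training possible. (A practical implementation would restrict or batch the assignment search rather than enumerate all 2^M cases.)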
no code implementations • 15 Mar 2021 • Zehai Tu, Ning Ma, Jon Barker
In this paper, we explore an alternative approach to finding the optimal fitting by introducing a hearing aid speech processing framework, in which the fitting is optimised in an automated way using an intelligibility objective function based on the HASPI physiological auditory model.
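The automated-fitting idea can be illustrated with a generic derivative-free search over per-band insertion gains. The `objective` argument is a hypothetical stand-in for a HASPI-style intelligibility score (evaluating the real HASPI auditory model is beyond a sketch), and the hill-climbing search is our own simplification, not the paper's optimiser:

```python
import numpy as np

def optimise_gains(objective, n_bands=6, iters=2000, step=2.0, seed=0):
    """Derivative-free search for per-band gains (dB) that maximise an
    intelligibility objective. `objective(gains)` stands in for scoring
    HASPI on speech processed with those gains."""
    rng = np.random.default_rng(seed)
    gains = np.zeros(n_bands)                  # start from a flat fitting
    best = objective(gains)
    for _ in range(iters):
        cand = gains + rng.normal(scale=step, size=n_bands)
        score = objective(cand)
        if score > best:                       # keep a candidate only if it helps
            gains, best = cand, score
    return gains, best
```

With a toy surrogate objective that peaks at a known target fitting, the search recovers gains near that target; the real framework replaces the surrogate with an intelligibility model of the listener's hearing loss.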
no code implementations • 20 Feb 2021 • Gerardo Roa Dabike, Jon Barker
In this paper, we ask whether vocal source features (pitch, shimmer, jitter, etc.) can improve the performance of automatic sung speech recognition, arguing that conclusions previously drawn from spoken speech studies may not be valid in the sung speech domain.
no code implementations • 7 Feb 2021 • Jisi Zhang, Catalin Zorila, Rama Doddipatla, Jon Barker
In this paper, we present a novel multi-channel speech extraction system to simultaneously extract multiple clean individual sources from a mixture in noisy and reverberant environments.
no code implementations • 11 Nov 2020 • Jisi Zhang, Catalin Zorila, Rama Doddipatla, Jon Barker
To reduce the influence of reverberation on spatial feature extraction, a dereverberation pre-processing method has been applied to further improve the separation performance.
no code implementations • 20 Apr 2020 • Shinji Watanabe, Michael Mandel, Jon Barker, Emmanuel Vincent, Ashish Arora, Xuankai Chang, Sanjeev Khudanpur, Vimal Manohar, Daniel Povey, Desh Raj, David Snyder, Aswin Shanmugam Subramanian, Jan Trmal, Bar Ben Yair, Christoph Boeddeker, Zhaoheng Ni, Yusuke Fujita, Shota Horiguchi, Naoyuki Kanda, Takuya Yoshioka, Neville Ryant
Following the success of the 1st, 2nd, 3rd, 4th and 5th CHiME challenges, we organize the 6th CHiME Speech Separation and Recognition Challenge (CHiME-6).
1 code implementation • ECCV 2018 (arXiv 2 Nov 2018) • Fitsum A. Reda, Guilin Liu, Kevin J. Shih, Robert Kirby, Jon Barker, David Tarjan, Andrew Tao, Bryan Catanzaro
We present an approach for high-resolution video frame prediction by conditioning on both past frames and past optical flows.
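The core operation behind flow-conditioned frame prediction of this kind is resampling a past frame with a dense flow field. A minimal numpy backward-warping sketch (our own illustration, not the paper's code or its learned sampling kernels):

```python
import numpy as np

def warp(frame, flow):
    """Backward-warp a grayscale frame (H, W) by a dense flow (H, W, 2).

    flow[y, x] = (dx, dy): the output pixel at (x, y) is sampled from
    (x + dx, y + dy) in the input, with bilinear interpolation and
    edge clamping."""
    H, W = frame.shape
    ys, xs = np.mgrid[0:H, 0:W].astype(float)
    sx = np.clip(xs + flow[..., 0], 0, W - 1)      # sample coordinates
    sy = np.clip(ys + flow[..., 1], 0, H - 1)
    x0 = np.floor(sx).astype(int); x1 = np.clip(x0 + 1, 0, W - 1)
    y0 = np.floor(sy).astype(int); y1 = np.clip(y0 + 1, 0, H - 1)
    wx, wy = sx - x0, sy - y0                      # bilinear weights
    top = frame[y0, x0] * (1 - wx) + frame[y0, x1] * wx
    bot = frame[y1, x0] * (1 - wx) + frame[y1, x1] * wx
    return top * (1 - wy) + bot * wy
```

A frame predictor in this family estimates the flow (and blending weights) from past frames, then applies a resampling step like this to synthesise the next frame.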
Ranked #1 on Video Prediction on YouTube-8M
no code implementations • 31 Jul 2018 • Mandar Gogate, Ahsan Adeel, Ricard Marxer, Jon Barker, Amir Hussain
The process of selective attention in the brain is known to contextually exploit the available audio and visual cues to better focus on the target speaker while filtering out other noise.
no code implementations • 28 Mar 2018 • Jon Barker, Shinji Watanabe, Emmanuel Vincent, Jan Trmal
The CHiME challenge series aims to advance robust automatic speech recognition (ASR) technology by promoting research at the interface of speech and language processing, signal processing, and machine learning.
Automatic Speech Recognition (ASR)
7 code implementations • 25 Oct 2017 • Edward Raff, Jon Barker, Jared Sylvester, Robert Brandon, Bryan Catanzaro, Charles Nicholas
In this work we introduce malware detection from raw byte sequences as a fruitful research area to the larger machine learning community.
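A gated-convolution detector over raw bytes, in the spirit of this line of work, can be sketched as a tiny numpy forward pass. All layer sizes, the initialisation, and the helper names are illustrative assumptions, not the paper's architecture details:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def raw_byte_score(byte_seq, params):
    """Forward pass of a raw-byte detector: byte embedding -> gated 1-D
    convolution -> global max pooling over positions -> logistic output."""
    E, Wc, Wg, w_out, b_out = params
    x = E[byte_seq]                               # (T, d) byte embeddings
    T, d = x.shape
    k = Wc.shape[0] // d                          # kernel width in bytes
    # Gather k consecutive embeddings per position (im2col layout).
    win = np.stack([x[i:i + k].ravel() for i in range(T - k + 1)])
    gated = (win @ Wc) * sigmoid(win @ Wg)        # gated conv activations
    pooled = gated.max(axis=0)                    # global max pooling
    return sigmoid(pooled @ w_out + b_out)        # malware probability

def init_params(vocab=256, d=8, k=4, f=16, seed=0):
    """Random, untrained parameters with illustrative sizes."""
    rng = np.random.default_rng(seed)
    return (rng.normal(scale=0.1, size=(vocab, d)),
            rng.normal(scale=0.1, size=(k * d, f)),
            rng.normal(scale=0.1, size=(k * d, f)),
            rng.normal(scale=0.1, size=f),
            0.0)
```

The global max pool is what lets a model of this shape consume whole executables of varying length: only the strongest activation per filter, wherever it occurs in the file, reaches the classifier head.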