Search Results for author: Jon Barker

Found 21 papers, 6 papers with code

SNuC: The Sheffield Numbers Spoken Language Corpus

no code implementations • LREC 2022 • Emma Barker, Jon Barker, Robert Gaizauskas, Ning Ma, Monica Lestari Paramita

We present SNuC, the first published corpus of spoken alphanumeric identifiers of the sort typically used as serial and part numbers in the manufacturing sector.

Non-Intrusive Speech Intelligibility Prediction for Hearing-Impaired Users using Intermediate ASR Features and Human Memory Models

no code implementations • 24 Jan 2024 • Rhiannon Mogridge, George Close, Robert Sutherland, Thomas Hain, Jon Barker, Stefan Goetze, Anton Ragni

Neural networks have been successfully used for non-intrusive speech intelligibility prediction.

Intelligibility prediction with a pretrained noise-robust automatic speech recognition model

no code implementations • 20 Oct 2023 • Zehai Tu, Ning Ma, Jon Barker

This paper describes two intelligibility prediction systems derived from a pretrained noise-robust automatic speech recognition (ASR) model for the second Clarity Prediction Challenge (CPC2).

Automatic Speech Recognition (ASR) +1

The First Cadenza Signal Processing Challenge: Improving Music for Those With a Hearing Loss

1 code implementation • 9 Oct 2023 • Gerardo Roa Dabike, Scott Bannister, Jennifer Firth, Simone Graetzer, Rebecca Vos, Michael A. Akeroyd, Jon Barker, Trevor J. Cox, Bruno Fazenda, Alinka Greasley, William Whitmer

The Cadenza project aims to improve the audio quality of music for those who have a hearing loss.

Ranked #1 on Cadenza 1 - Task 2 - In Car on FMA

Cadenza 1 - Task 1 - Headphone Cadenza 1 - Task 2 - In Car

The ICASSP SP Cadenza Challenge: Music Demixing/Remixing for Hearing Aids

no code implementations • 5 Oct 2023 • Gerardo Roa Dabike, Michael A. Akeroyd, Scott Bannister, Jon Barker, Trevor J. Cox, Bruno Fazenda, Jennifer Firth, Simone Graetzer, Alinka Greasley, Rebecca R. Vos, William M. Whitmer

This paper reports on the design and results of the 2024 ICASSP SP Cadenza Challenge: Music Demixing/Remixing for Hearing Aids.

On monoaural speech enhancement for automatic recognition of real noisy speech using mixture invariant training

no code implementations • 3 May 2022 • Jisi Zhang, Catalin Zorila, Rama Doddipatla, Jon Barker

In this paper, we explore an improved framework to train a monoaural neural enhancement model for robust speech recognition.

Robust Speech Recognition Speech Enhancement +1

Auditory-Based Data Augmentation for End-to-End Automatic Speech Recognition

no code implementations • 8 Apr 2022 • Zehai Tu, Jack Deadman, Ning Ma, Jon Barker

End-to-end models have achieved significant improvement on automatic speech recognition.

Automatic Speech Recognition (ASR) +2

Exploiting Hidden Representations from a DNN-based Speech Recogniser for Speech Intelligibility Prediction in Hearing-impaired Listeners

1 code implementation • 8 Apr 2022 • Zehai Tu, Ning Ma, Jon Barker

An accurate objective speech intelligibility prediction algorithm is of great interest for many applications such as speech enhancement for hearing aids.

Speech Enhancement speech-recognition +1

Unsupervised Uncertainty Measures of Automatic Speech Recognition for Non-intrusive Speech Intelligibility Prediction

1 code implementation • 8 Apr 2022 • Zehai Tu, Ning Ma, Jon Barker

Non-intrusive intelligibility prediction is important for its application in realistic scenarios, where a clean reference signal is difficult to access.

Automatic Speech Recognition (ASR) +1

Leveraging Bitstream Metadata for Fast, Accurate, Generalized Compressed Video Quality Enhancement

no code implementations • 31 Jan 2022 • Max Ehrlich, Jon Barker, Namitha Padmanabhan, Larry Davis, Andrew Tao, Bryan Catanzaro, Abhinav Shrivastava

Video compression is a central feature of the modern internet, powering technologies from social media to video conferencing.

Quantization Video Compression

Teacher-Student MixIT for Unsupervised and Semi-supervised Speech Separation

no code implementations • 15 Jun 2021 • Jisi Zhang, Catalin Zorila, Rama Doddipatla, Jon Barker

The proposed method first uses mixtures of unseparated sources and the mixture invariant training (MixIT) criterion to train a teacher model.

Speech Separation

DHASP: Differentiable Hearing Aid Speech Processing

no code implementations • 15 Mar 2021 • Zehai Tu, Ning Ma, Jon Barker

In this paper, we explore an alternative approach to finding the optimal fitting by introducing a hearing aid speech processing framework, in which the fitting is optimised in an automated way using an intelligibility objective function based on the HASPI physiological auditory model.

The Use of Voice Source Features for Sung Speech Recognition

no code implementations • 20 Feb 2021 • Gerardo Roa Dabike, Jon Barker

In this paper, we ask whether vocal source features (pitch, shimmer, jitter, etc.) can improve the performance of automatic sung speech recognition, arguing that conclusions previously drawn from spoken speech studies may not be valid in the sung speech domain.

Speech Recognition +1

Time-Domain Speech Extraction with Spatial Information and Multi Speaker Conditioning Mechanism

no code implementations • 7 Feb 2021 • Jisi Zhang, Catalin Zorila, Rama Doddipatla, Jon Barker

In this paper, we present a novel multi-channel speech extraction system to simultaneously extract multiple clean individual sources from a mixture in noisy and reverberant environments.

Speech Extraction speech-recognition +1

On End-to-end Multi-channel Time Domain Speech Separation in Reverberant Environments

no code implementations • 11 Nov 2020 • Jisi Zhang, Catalin Zorila, Rama Doddipatla, Jon Barker

To reduce the influence of reverberation on spatial feature extraction, a dereverberation pre-processing method has been applied to further improve the separation performance.

Speech Recognition +1

CHiME-6 Challenge: Tackling Multispeaker Speech Recognition for Unsegmented Recordings

no code implementations • 20 Apr 2020 • Shinji Watanabe, Michael Mandel, Jon Barker, Emmanuel Vincent, Ashish Arora, Xuankai Chang, Sanjeev Khudanpur, Vimal Manohar, Daniel Povey, Desh Raj, David Snyder, Aswin Shanmugam Subramanian, Jan Trmal, Bar Ben Yair, Christoph Boeddeker, Zhaoheng Ni, Yusuke Fujita, Shota Horiguchi, Naoyuki Kanda, Takuya Yoshioka, Neville Ryant

Following the success of the 1st, 2nd, 3rd, 4th and 5th CHiME challenges we organize the 6th CHiME Speech Separation and Recognition Challenge (CHiME-6).

Speaker Diarization +4

SDCNet: Video Prediction Using Spatially-Displaced Convolution

1 code implementation • 2 Nov 2018 • Fitsum A. Reda, Guilin Liu, Kevin J. Shih, Robert Kirby, Jon Barker, David Tarjan, Andrew Tao, Bryan Catanzaro

We present an approach for high-resolution video frame prediction by conditioning on both past frames and past optical flows.

Optical Flow Estimation SSIM +1

SDC-Net: Video prediction using spatially-displaced convolution

1 code implementation • ECCV 2018 • Fitsum A. Reda, Guilin Liu, Kevin J. Shih, Robert Kirby, Jon Barker, David Tarjan, Andrew Tao, Bryan Catanzaro

We present an approach for high-resolution video frame prediction by conditioning on both past frames and past optical flows.

Ranked #1 on Video Prediction on YouTube-8M

Optical Flow Estimation SSIM +1

DNN driven Speaker Independent Audio-Visual Mask Estimation for Speech Separation

no code implementations • 31 Jul 2018 • Mandar Gogate, Ahsan Adeel, Ricard Marxer, Jon Barker, Amir Hussain

The process of selective attention in the brain is known to contextually exploit the available audio and visual cues to better focus on the target speaker while filtering out other noises.

Speech Separation

The fifth 'CHiME' Speech Separation and Recognition Challenge: Dataset, task and baselines

no code implementations • 28 Mar 2018 • Jon Barker, Shinji Watanabe, Emmanuel Vincent, Jan Trmal

The CHiME challenge series aims to advance robust automatic speech recognition (ASR) technology by promoting research at the interface of speech and language processing, signal processing, and machine learning.

Automatic Speech Recognition (ASR) +4

Malware Detection by Eating a Whole EXE

7 code implementations • 25 Oct 2017 • Edward Raff, Jon Barker, Jared Sylvester, Robert Brandon, Bryan Catanzaro, Charles Nicholas

In this work we introduce malware detection from raw byte sequences as a fruitful research area to the larger machine learning community.

Malware Detection
