Search Results for author: Daniel Galvez

Found 6 papers, 2 papers with code

GPU-Accelerated WFST Beam Search Decoder for CTC-based Speech Recognition

1 code implementation • 8 Nov 2023 • Daniel Galvez, Tim Kaldewey

While Connectionist Temporal Classification (CTC) models deliver state-of-the-art accuracy in automated speech recognition (ASR) pipelines, their performance has been limited by CPU-based beam search decoding.

Speech Recognition
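The abstract above concerns accelerating CTC beam search decoding; while the paper's GPU WFST decoder is far more involved, the core CTC rule that any such decoder applies can be sketched. This is an illustrative collapse function (not the paper's implementation): merge consecutive repeated labels, then drop blanks.

```python
def ctc_collapse(frame_labels, blank=0):
    """Collapse a per-frame CTC label sequence into an output sequence:
    merge consecutive repeats, then remove blank symbols."""
    out = []
    prev = None
    for label in frame_labels:
        # Emit a label only when it changes from the previous frame
        # and is not the blank symbol.
        if label != prev and label != blank:
            out.append(label)
        prev = label
    return out

# e.g. frames [0, 1, 1, 0, 1, 2, 2, 0] collapse to [1, 1, 2]:
# the blank between the two 1s keeps them as distinct outputs.
print(ctc_collapse([0, 1, 1, 0, 1, 2, 2, 0]))
```

Beam search decoders score many candidate collapsed sequences per utterance, which is why moving that search off the CPU matters for throughput.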

Speech Wikimedia: A 77 Language Multilingual Speech Dataset

1 code implementation • 30 Aug 2023 • Rafael Mosquera Gómez, Julián Eusse, Juan Ciro, Daniel Galvez, Ryan Hileman, Kurt Bollacker, David Kanter

The Speech Wikimedia Dataset is a publicly available compilation of audio with transcriptions extracted from Wikimedia Commons.

Machine Translation, Speech Recognition +2

LSH methods for data deduplication in a Wikipedia artificial dataset

no code implementations • 10 Dec 2021 • Juan Ciro, Daniel Galvez, Tim Schlippe, David Kanter

This paper illustrates locality sensitive hashing (LSH) models for the identification and removal of nearly redundant data in a text dataset.
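The LSH-based deduplication described above can be sketched with MinHash signatures and banding; this is a minimal illustration using stdlib hashing, not the paper's pipeline (shingle size, signature length, and band count here are arbitrary choices):

```python
import hashlib
from itertools import combinations

def shingles(text, k=5):
    """Character k-shingles of a whitespace-normalized, lowercased string."""
    text = " ".join(text.lower().split())
    return {text[i:i + k] for i in range(len(text) - k + 1)}

def minhash(shingle_set, num_hashes=64):
    """MinHash signature: for each seed, the minimum hash over all shingles."""
    return [
        min(int.from_bytes(hashlib.md5(f"{seed}:{s}".encode()).digest()[:8], "big")
            for s in shingle_set)
        for seed in range(num_hashes)
    ]

def lsh_candidates(docs, num_hashes=64, bands=16):
    """Bucket documents by signature bands; any two documents sharing a
    bucket become a near-duplicate candidate pair."""
    rows = num_hashes // bands
    sigs = {name: minhash(shingles(text), num_hashes) for name, text in docs.items()}
    buckets = {}
    for name, sig in sigs.items():
        for b in range(bands):
            key = (b, tuple(sig[b * rows:(b + 1) * rows]))
            buckets.setdefault(key, set()).add(name)
    return {pair
            for names in buckets.values() if len(names) > 1
            for pair in combinations(sorted(names), 2)}
```

Banding makes the scheme sub-quadratic: only pairs that collide in at least one band are compared, so near-identical texts surface as candidates without checking every document pair.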

The People's Speech: A Large-Scale Diverse English Speech Recognition Dataset for Commercial Usage

no code implementations • 17 Nov 2021 • Daniel Galvez, Greg Diamos, Juan Ciro, Juan Felipe Cerón, Keith Achorn, Anjali Gopi, David Kanter, Maximilian Lam, Mark Mazumder, Vijay Janapa Reddi

The People's Speech is a free-to-download 30,000-hour and growing supervised conversational English speech recognition dataset licensed for academic and commercial usage under CC-BY-SA (with a CC-BY subset).

Speech Recognition

Multiple-Instance, Cascaded Classification for Keyword Spotting in Narrow-Band Audio

no code implementations • 21 Nov 2017 • Ahmad AbdulKader, Kareem Nassar, Mohamed Mahmoud, Daniel Galvez, Chetan Patil

We propose using cascaded classifiers for a keyword spotting (KWS) task on narrow-band (NB), 8kHz audio acquired in non-IID environments, a more challenging task than most state-of-the-art KWS systems face.

General Classification, Keyword Spotting +1

Purely sequence-trained neural networks for ASR based on lattice-free MMI

no code implementations • INTERSPEECH 2016 • Daniel Povey, Vijayaditya Peddinti, Daniel Galvez, Pegah Ghahrmani, Vimal Manohar, Xingyu Na, Yiming Wang, Sanjeev Khudanpur

Models trained with LFMMI provide a relative word error rate reduction of ~11.5% over those trained with the cross-entropy objective function, and ~8% over those trained with the cross-entropy and sMBR objective functions.

Language Modelling, Speech Recognition
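The abstract above reports *relative* word error rate reductions, which are easy to misread as absolute point drops. A small worked example with hypothetical WER values (the actual baselines are not given in the snippet):

```python
def relative_wer_reduction(baseline, improved):
    """Relative WER reduction: (baseline - improved) / baseline."""
    return (baseline - improved) / baseline

# Hypothetical illustration: a baseline WER of 10.0% falling to 8.85%
# is a ~11.5% relative reduction -- an absolute drop of only 1.15 points.
print(relative_wer_reduction(10.0, 8.85))
```

So the ~11.5% figure compares LFMMI against cross-entropy training as a ratio of the baseline error, not as a subtraction of percentage points.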
