Search Results for author: Daniel Galvez

Found 6 papers, 2 papers with code

GPU-Accelerated WFST Beam Search Decoder for CTC-based Speech Recognition

1 code implementation • 8 Nov 2023 • Daniel Galvez, Tim Kaldewey

While Connectionist Temporal Classification (CTC) models deliver state-of-the-art accuracy in automated speech recognition (ASR) pipelines, their performance has been limited by CPU-based beam search decoding.

speech-recognition Speech Recognition

Speech Wikimedia: A 77 Language Multilingual Speech Dataset

1 code implementation • 30 Aug 2023 • Rafael Mosquera Gómez, Julián Eusse, Juan Ciro, Daniel Galvez, Ryan Hileman, Kurt Bollacker, David Kanter

The Speech Wikimedia Dataset is a publicly available compilation of audio with transcriptions extracted from Wikimedia Commons.

Machine Translation speech-recognition +2

LSH methods for data deduplication in a Wikipedia artificial dataset

no code implementations • 10 Dec 2021 • Juan Ciro, Daniel Galvez, Tim Schlippe, David Kanter

This paper illustrates locality-sensitive hashing (LSH) models for the identification and removal of nearly redundant data in a text dataset.

The People's Speech: A Large-Scale Diverse English Speech Recognition Dataset for Commercial Usage

no code implementations • 17 Nov 2021 • Daniel Galvez, Greg Diamos, Juan Ciro, Juan Felipe Cerón, Keith Achorn, Anjali Gopi, David Kanter, Maximilian Lam, Mark Mazumder, Vijay Janapa Reddi

The People's Speech is a free-to-download 30,000-hour and growing supervised conversational English speech recognition dataset licensed for academic and commercial usage under CC-BY-SA (with a CC-BY subset).

speech-recognition Speech Recognition

Multiple-Instance, Cascaded Classification for Keyword Spotting in Narrow-Band Audio

no code implementations • 21 Nov 2017 • Ahmad AbdulKader, Kareem Nassar, Mohamed Mahmoud, Daniel Galvez, Chetan Patil

We propose using cascaded classifiers for a keyword spotting (KWS) task on narrow-band (NB), 8 kHz audio acquired in non-IID environments, a more challenging task than most state-of-the-art KWS systems face.

General Classification Keyword Spotting +1

Purely sequence-trained neural networks for ASR based on lattice-free MMI

no code implementations • INTERSPEECH 2016 • Daniel Povey, Vijayaditya Peddinti, Daniel Galvez, Pegah Ghahrmani, Vimal Manohar, Xingyu Na, Yiming Wang, Sanjeev Khudanpur

Models trained with LFMMI provide a relative word error rate reduction of ∼11.5% over those trained with a cross-entropy objective function, and ∼8% over those trained with cross-entropy and sMBR objective functions.

Ranked #4 on Speech Recognition on WSJ eval92

Language Modelling Speech Recognition
