Search Results for author: Ozlem Kalinli

Found 14 papers, 0 papers with code

Scaling ASR Improves Zero and Few Shot Learning

no code implementations • 10 Nov 2021 • Alex Xiao, Weiyi Zheng, Gil Keren, Duc Le, Frank Zhang, Christian Fuegen, Ozlem Kalinli, Yatharth Saraf, Abdelrahman Mohamed

With 4.5 million hours of English speech from 10 different sources across 120 countries and models of up to 10 billion parameters, we explore the frontiers of scale for automatic speech recognition.

automatic-speech-recognition Few-Shot Learning +1

Omni-sparsity DNN: Fast Sparsity Optimization for On-Device Streaming E2E ASR via Supernet

no code implementations • 15 Oct 2021 • Haichuan Yang, Yuan Shangguan, Dilin Wang, Meng Li, Pierce Chuang, Xiaohui Zhang, Ganesh Venkatesh, Ozlem Kalinli, Vikas Chandra

From wearables to powerful smart devices, modern automatic speech recognition (ASR) models run on a variety of edge devices with different computational budgets.

automatic-speech-recognition Fine-tuning +1

Transferring Voice Knowledge for Acoustic Event Detection: An Empirical Study

no code implementations • 7 Oct 2021 • Dawei Liang, Yangyang Shi, Yun Wang, Nayan Singhal, Alex Xiao, Jonathan Shaw, Edison Thomaz, Ozlem Kalinli, Mike Seltzer

Detection of common events and scenes from audio is useful for extracting and understanding human contexts in daily life.

Event Detection

Collaborative Training of Acoustic Encoders for Speech Recognition

no code implementations • 16 Jun 2021 • Varun Nagaraja, Yangyang Shi, Ganesh Venkatesh, Ozlem Kalinli, Michael L. Seltzer, Vikas Chandra

On-device speech recognition requires training models of different sizes for deploying on devices with various computational budgets.

Speech Recognition

Dissecting User-Perceived Latency of On-Device E2E Speech Recognition

no code implementations • 6 Apr 2021 • Yuan Shangguan, Rohit Prabhavalkar, Hang Su, Jay Mahadeokar, Yangyang Shi, Jiatong Zhou, Chunyang Wu, Duc Le, Ozlem Kalinli, Christian Fuegen, Michael L. Seltzer

As speech-enabled devices such as smartphones and smart speakers become increasingly ubiquitous, there is growing interest in building automatic speech recognition (ASR) systems that can run directly on-device; end-to-end (E2E) speech recognition models such as recurrent neural network transducers and their variants have recently emerged as prime candidates for this task.

automatic-speech-recognition Speech Recognition

Dynamic Encoder Transducer: A Flexible Solution For Trading Off Accuracy For Latency

no code implementations • 5 Apr 2021 • Yangyang Shi, Varun Nagaraja, Chunyang Wu, Jay Mahadeokar, Duc Le, Rohit Prabhavalkar, Alex Xiao, Ching-Feng Yeh, Julian Chan, Christian Fuegen, Ozlem Kalinli, Michael L. Seltzer

DET achieves accuracy similar to a baseline model, with better latency, on a large in-house data set by assigning a lightweight encoder to the beginning of an utterance and a full-size encoder to the rest.

Speech Recognition

Bandwidth Embeddings for Mixed-bandwidth Speech Recognition

no code implementations • 5 Sep 2019 • Gautam Mantena, Ozlem Kalinli, Ossama Abdel-hamid, Don McAllaster

In this paper, we tackle the problem of handling narrowband and wideband speech by building a single acoustic model (AM), also called mixed bandwidth AM.

Speech Recognition
