Search Results for author: Pedro Moreno Mengibar

Found 7 papers, 0 papers with code

TransformerFAM: Feedback attention is working memory

no code implementations • 14 Apr 2024 • Dongseong Hwang, Weiran Wang, Zhuoyuan Huo, Khe Chai Sim, Pedro Moreno Mengibar

While Transformers have revolutionized deep learning, their quadratic attention complexity hinders their ability to process infinitely long inputs.

Paper
Add Code

Hierarchical Recurrent Adapters for Efficient Multi-Task Adaptation of Large Speech Models

no code implementations • 25 Mar 2024 • Tsendsuren Munkhdalai, Youzheng Chen, Khe Chai Sim, Fadi Biadsy, Tara Sainath, Pedro Moreno Mengibar

However, their per-task parameter overhead is considered still high when the number of downstream tasks to adapt for is large.

Automatic Speech Recognition speech-recognition +1

Paper
Add Code

Audio-AdapterFusion: A Task-ID-free Approach for Efficient and Non-Destructive Multi-task Speech Recognition

no code implementations • 17 Oct 2023 • Hillary Ngai, Rohan Agrawal, Neeraj Gaur, Ronny Huang, Parisa Haghani, Pedro Moreno Mengibar

Adapters are an efficient, composable alternative to full fine-tuning of pre-trained models and help scale the deployment of large ASR models to many tasks.

speech-recognition Speech Recognition

Paper
Add Code

Contextual Biasing with the Knuth-Morris-Pratt Matching Algorithm

no code implementations • 29 Sep 2023 • Weiran Wang, Zelin Wu, Diamantino Caseiro, Tsendsuren Munkhdalai, Khe Chai Sim, Pat Rondon, Golan Pundak, Gan Song, Rohit Prabhavalkar, Zhong Meng, Ding Zhao, Tara Sainath, Pedro Moreno Mengibar

Contextual biasing refers to the problem of biasing the automatic speech recognition (ASR) systems towards rare entities that are relevant to the specific user or application scenarios.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Paper
Add Code

Massive End-to-end Models for Short Search Queries

no code implementations • 22 Sep 2023 • Weiran Wang, Rohit Prabhavalkar, Dongseong Hwang, Qiujia Li, Khe Chai Sim, Bo Li, James Qin, Xingyu Cai, Adam Stooke, Zhong Meng, CJ Zheng, Yanzhang He, Tara Sainath, Pedro Moreno Mengibar

In this work, we investigate two popular end-to-end automatic speech recognition (ASR) models, namely Connectionist Temporal Classification (CTC) and RNN-Transducer (RNN-T), for offline recognition of voice search queries, with up to 2B model parameters.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Paper
Add Code

Improving Speech Recognition for African American English With Audio Classification

no code implementations • 16 Sep 2023 • Shefali Garg, Zhouyuan Huo, Khe Chai Sim, Suzan Schwartz, Mason Chua, Alëna Aksënova, Tsendsuren Munkhdalai, Levi King, Darryl Wright, Zion Mengesha, Dongseong Hwang, Tara Sainath, Françoise Beaufays, Pedro Moreno Mengibar

By combining the classifier output with coarse geographic information, we can select a subset of utterances from a large corpus of untranscribed short-form queries for semi-supervised learning at scale.

Audio Classification Automatic Speech Recognition +2

Paper
Add Code

Non-Parallel Voice Conversion for ASR Augmentation

no code implementations • 15 Sep 2022 • Gary Wang, Andrew Rosenberg, Bhuvana Ramabhadran, Fadi Biadsy, Yinghui Huang, Jesse Emond, Pedro Moreno Mengibar

For ASR augmentation, it is necessary that the VC model be robust to a wide range of input speech.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.