no code implementations • 29 Feb 2024 • Takaaki Saeki, Gary Wang, Nobuyuki Morioka, Isaac Elias, Kyle Kastner, Fadi Biadsy, Andrew Rosenberg, Bhuvana Ramabhadran, Heiga Zen, Françoise Beaufays, Hadar Shemtov
Without any transcribed speech in a new language, this TTS model can generate intelligible speech in more than 30 unseen languages (CER within 10% of ground truth).
no code implementations • 16 Sep 2023 • Shefali Garg, Zhouyuan Huo, Khe Chai Sim, Suzan Schwartz, Mason Chua, Alëna Aksënova, Tsendsuren Munkhdalai, Levi King, Darryl Wright, Zion Mengesha, Dongseong Hwang, Tara Sainath, Françoise Beaufays, Pedro Moreno Mengibar
By combining the classifier output with coarse geographic information, we can select a subset of utterances from a large corpus of untranscribed short-form queries for semi-supervised learning at scale.
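The selection idea can be sketched as a simple filter that keeps only untranscribed utterances a classifier scores highly and that come from the targeted coarse region. This is a toy illustration; the field names and threshold below are hypothetical, not from the paper.

```python
# Toy sketch: pick utterances for semi-supervised training by combining
# a classifier confidence score with coarse geographic metadata.
# Field names ("classifier_score", "region") and the 0.8 threshold are
# illustrative assumptions, not details from the paper.

def select_utterances(utterances, target_region, min_confidence=0.8):
    """Keep untranscribed utterances the classifier scores highly
    and that originate from the targeted coarse region."""
    return [
        u for u in utterances
        if u["classifier_score"] >= min_confidence
        and u["region"] == target_region
    ]

corpus = [
    {"id": "a", "classifier_score": 0.95, "region": "US-South"},
    {"id": "b", "classifier_score": 0.40, "region": "US-South"},
    {"id": "c", "classifier_score": 0.90, "region": "US-West"},
]
selected = select_utterances(corpus, "US-South")
# Only utterance "a" passes both filters.
```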
no code implementations • 31 Mar 2023 • Rami Botros, Rohit Prabhavalkar, Johan Schalkwyk, Ciprian Chelba, Tara N. Sainath, Françoise Beaufays
Overall, they present a modular, powerful and cheap alternative to the standard encoder output, as well as the N-best hypotheses.
no code implementations • 2 Mar 2023 • Yu Zhang, Wei Han, James Qin, Yongqiang Wang, Ankur Bapna, Zhehuai Chen, Nanxin Chen, Bo Li, Vera Axelrod, Gary Wang, Zhong Meng, Ke Hu, Andrew Rosenberg, Rohit Prabhavalkar, Daniel S. Park, Parisa Haghani, Jason Riesa, Ginger Perng, Hagen Soltau, Trevor Strohman, Bhuvana Ramabhadran, Tara Sainath, Pedro Moreno, Chung-Cheng Chiu, Johan Schalkwyk, Françoise Beaufays, Yonghui Wu
We introduce the Universal Speech Model (USM), a single large model that performs automatic speech recognition (ASR) across 100+ languages.
Automatic Speech Recognition (ASR)
no code implementations • 14 Sep 2022 • Rongmei Lin, Yonghui Xiao, Tien-Ju Yang, Ding Zhao, Li Xiong, Giovanni Motta, Françoise Beaufays
Automatic Speech Recognition models require large amounts of speech data for training, and the collection of such data often leads to privacy concerns.
Automatic Speech Recognition (ASR)
no code implementations • 6 May 2022 • Tien-Ju Yang, Yonghui Xiao, Giovanni Motta, Françoise Beaufays, Rajiv Mathews, Mingqing Chen
This paper addresses the challenges of training large neural network models under federated learning settings: high on-device memory usage and communication cost.
no code implementations • 18 Apr 2022 • Ehsan Amid, Om Thakkar, Arun Narayanan, Rajiv Mathews, Françoise Beaufays
We design Noise Masking, a fill-in-the-blank style method for extracting targeted parts of training data from trained ASR models.
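The fill-in-the-blank idea behind Noise Masking can be illustrated with a toy model: replace part of an input with noise and see whether the model "completes" it with content memorized from training data. The dictionary lookup below is a stand-in for a trained ASR system, purely for illustration; nothing here is the paper's implementation.

```python
# Toy illustration of the Noise Masking attack concept: mask a span of an
# utterance and check whether the model fills the blank with memorized
# (potentially private) training content. The "model" here is a crude
# nearest-match over memorized utterances, a stand-in for a real ASR system.

TRAINING_UTTERANCES = [
    "call alice smith",
    "navigate to main street",
]

def toy_asr(masked_words):
    """Pretend ASR: completes a masked utterance with the memorized
    training utterance that best matches the unmasked positions."""
    def overlap(utt):
        words = utt.split()
        return sum(1 for i, w in enumerate(masked_words)
                   if w != "<noise>" and i < len(words) and words[i] == w)
    return max(TRAINING_UTTERANCES, key=overlap)

# Mask the name and let the "model" reveal it.
query = ["call", "<noise>", "<noise>"]
hypothesis = toy_asr(query)
leaked = hypothesis.split()[1:]  # the filled-in, potentially private part
```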
no code implementations • 11 Apr 2022 • Andrew Hard, Kurt Partridge, Neng Chen, Sean Augenstein, Aishanee Shah, Hyun Jin Park, Alex Park, Sara Ng, Jessica Nguyen, Ignacio Lopez Moreno, Rajiv Mathews, Françoise Beaufays
We trained a keyword spotting model using federated learning on real user devices and observed significant improvements when the model was deployed for inference on phones.
no code implementations • 17 Jan 2022 • Andreas Kabel, Keith Hall, Tom Ouyang, David Rybach, Daan van Esch, Françoise Beaufays
This paper proposes a framework to improve the typing experience of mobile users in morphologically rich languages.
1 code implementation • NeurIPS 2021 • Trung Dang, Om Thakkar, Swaroop Ramaswamy, Rajiv Mathews, Peter Chin, Françoise Beaufays
Prior works have demonstrated that labels can be revealed analytically from the last layer of certain models (e.g., ResNet), or reconstructed jointly with model inputs using Gradients Matching [Zhu et al. '19] with additional knowledge about the current state of the model.
Automatic Speech Recognition (ASR)
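The analytic label-recovery observation the attack builds on is easy to demonstrate: for a softmax-plus-cross-entropy output layer, the gradient with respect to the last-layer bias equals softmax(logits) minus the one-hot label, so its single negative entry reveals the label. A self-contained NumPy sketch:

```python
# Analytic label recovery from a last-layer gradient: with softmax +
# cross-entropy, d(loss)/d(bias) = softmax(logits) - one_hot(label).
# Every entry is positive except the true-label entry, so argmin of the
# shared gradient recovers the label. Illustrative, self-contained NumPy.

import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(0)
num_classes, true_label = 5, 3

logits = rng.normal(size=num_classes)
probs = softmax(logits)

one_hot = np.zeros(num_classes)
one_hot[true_label] = 1.0
bias_grad = probs - one_hot            # what a gradient-sharing client reveals

recovered = int(np.argmin(bias_grad))  # the only negative entry marks the label
```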
no code implementations • 11 Oct 2021 • Tien-Ju Yang, Dhruv Guliani, Françoise Beaufays, Giovanni Motta
This paper aims to address the major challenges of Federated Learning (FL) on edge devices: limited memory and expensive communication.
no code implementations • 8 Oct 2021 • Lillian Zhou, Dhruv Guliani, Andreas Kabel, Giovanni Motta, Françoise Beaufays
Transformer-based architectures have been the subject of research aimed at understanding their overparameterization and the non-uniform importance of their layers.
Automatic Speech Recognition (ASR)
no code implementations • 5 Oct 2021 • Tsendsuren Munkhdalai, Khe Chai Sim, Angad Chandorkar, Fan Gao, Mason Chua, Trevor Strohman, Françoise Beaufays
Fast contextual adaptation has been shown to be effective at improving Automatic Speech Recognition (ASR) of rare words, and when combined with on-device personalized training, it can yield even better recognition results.
Automatic Speech Recognition (ASR)
no code implementations • 1 Oct 2021 • Zhouyuan Huo, Dongseong Hwang, Khe Chai Sim, Shefali Garg, Ananya Misra, Nikhil Siddhartha, Trevor Strohman, Françoise Beaufays
These models are typically trained on the server using transcribed speech data.
no code implementations • 1 Oct 2021 • Dongseong Hwang, Ananya Misra, Zhouyuan Huo, Nikhil Siddhartha, Shefali Garg, David Qiu, Khe Chai Sim, Trevor Strohman, Françoise Beaufays, Yanzhang He
Self- and semi-supervised learning methods have been actively investigated to reduce the need for labeled training data or to enhance model performance.
no code implementations • 27 Sep 2021 • Yu Zhang, Daniel S. Park, Wei Han, James Qin, Anmol Gulati, Joel Shor, Aren Jansen, Yuanzhong Xu, Yanping Huang, Shibo Wang, Zongwei Zhou, Bo Li, Min Ma, William Chan, Jiahui Yu, Yongqiang Wang, Liangliang Cao, Khe Chai Sim, Bhuvana Ramabhadran, Tara N. Sainath, Françoise Beaufays, Zhifeng Chen, Quoc V. Le, Chung-Cheng Chiu, Ruoming Pang, Yonghui Wu
We summarize the results of a host of efforts using giant automatic speech recognition (ASR) models pre-trained using large, diverse unlabeled datasets containing approximately a million hours of audio.
Ranked #1 on Speech Recognition on Common Voice
Automatic Speech Recognition (ASR)
no code implementations • 18 Jun 2021 • Katrin Tomanek, Françoise Beaufays, Julie Cattiau, Angad Chandorkar, Khe Chai Sim
While current state-of-the-art Automatic Speech Recognition (ASR) systems achieve high accuracy on typical speech, they suffer from significant performance degradation on disordered speech and other atypical speech patterns.
Automatic Speech Recognition (ASR)
1 code implementation • 15 Apr 2021 • Trung Dang, Om Thakkar, Swaroop Ramaswamy, Rajiv Mathews, Peter Chin, Françoise Beaufays
We show that a dropout rate of 0.2 can reduce the speaker identity accuracy to 0% top-1 (0.5% top-5).
Automatic Speech Recognition (ASR)
no code implementations • 21 Sep 2020 • Swaroop Ramaswamy, Om Thakkar, Rajiv Mathews, Galen Andrew, H. Brendan McMahan, Françoise Beaufays
This paper presents the first consumer-scale next-word prediction (NWP) model trained with Federated Learning (FL) while leveraging the Differentially Private Federated Averaging (DP-FedAvg) technique.
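The DP-FedAvg server step can be sketched in a few lines: clip each client's model update to a fixed L2 norm, average the clipped updates, and add Gaussian noise calibrated to the clip norm. The hyperparameter values below are illustrative, not the paper's.

```python
# Minimal sketch of a DP-FedAvg aggregation round: per-client L2 clipping,
# averaging, and Gaussian noise scaled to the clip norm. Clip norm and
# noise multiplier are illustrative values, not from the paper.

import numpy as np

def dp_fedavg_round(client_updates, clip_norm=1.0, noise_multiplier=1.1,
                    seed=0):
    rng = np.random.default_rng(seed)
    clipped = []
    for u in client_updates:
        norm = max(np.linalg.norm(u), 1e-12)
        clipped.append(u * min(1.0, clip_norm / norm))  # clip to L2 <= clip_norm
    mean_update = np.mean(clipped, axis=0)
    noise_std = noise_multiplier * clip_norm / len(client_updates)
    return mean_update + rng.normal(0.0, noise_std, size=mean_update.shape)

# Three simulated client updates, one with a large norm that gets clipped.
updates = [np.array([3.0, 4.0]), np.array([0.1, -0.2]), np.array([-1.0, 2.0])]
noisy_mean = dp_fedavg_round(updates)
```

Clipping bounds each client's influence on the average, which is what makes the added Gaussian noise yield a differential-privacy guarantee.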
no code implementations • 12 Jun 2020 • Om Thakkar, Swaroop Ramaswamy, Rajiv Mathews, Françoise Beaufays
In this paper, we initiate a formal study to understand the effect of different components of canonical FL on unintended memorization in trained models, comparing with the central learning setting.
no code implementations • 24 Jan 2020 • Mary Gooneratne, Khe Chai Sim, Petr Zadrazil, Andreas Kabel, Françoise Beaufays, Giovanni Motta
Training machine learning models on mobile devices has the potential of improving both privacy and accuracy of the models.
Automatic Speech Recognition (ASR)
no code implementations • 14 Dec 2019 • Khe Chai Sim, Françoise Beaufays, Arnaud Benard, Dhruv Guliani, Andreas Kabel, Nikhil Khare, Tamar Lucassen, Petr Zadrazil, Harry Zhang, Leif Johnson, Giovanni Motta, Lillian Zhou
With speech input, if the user corrects only the names, the name recall rate improves to 64.4%.
no code implementations • 3 Dec 2019 • Daan van Esch, Elnaz Sarbar, Tamar Lucassen, Jeremy O'Brien, Theresa Breiner, Manasa Prasad, Evan Crew, Chieu Nguyen, Françoise Beaufays
Today, Gboard supports 900+ language varieties across 70+ writing systems, and this report describes how and why we have been adding support for hundreds of language varieties from around the globe.
1 code implementation • 22 Oct 2019 • Kangkang Wang, Rajiv Mathews, Chloé Kiddon, Hubert Eichner, Françoise Beaufays, Daniel Ramage
Federated learning is a distributed, on-device computation framework that enables training global models without exporting sensitive user data to servers.
no code implementations • CONLL 2019 • Mingqing Chen, Ananda Theertha Suresh, Rajiv Mathews, Adeline Wong, Cyril Allauzen, Françoise Beaufays, Michael Riley
The n-gram language models trained with federated learning are compared to n-grams trained with traditional server-based algorithms using A/B tests on tens of millions of users of a virtual keyboard.
no code implementations • 14 Sep 2019 • Khe Chai Sim, Petr Zadrazil, Françoise Beaufays
Speaker-independent speech recognition systems trained with data from many users are generally robust against speaker variability and work well for a large population of speakers.
Automatic Speech Recognition (ASR)
no code implementations • 11 Jun 2019 • Swaroop Ramaswamy, Rajiv Mathews, Kanishka Rao, Françoise Beaufays
We show that a word-level recurrent neural network can predict emoji from text typed on a mobile keyboard.
no code implementations • 26 Mar 2019 • Mingqing Chen, Rajiv Mathews, Tom Ouyang, Françoise Beaufays
We demonstrate that a character-level recurrent neural network is able to learn out-of-vocabulary (OOV) words under federated learning settings, for the purpose of expanding the vocabulary of a virtual keyboard for smartphones without exporting sensitive text to servers.
1 code implementation • 7 Dec 2018 • Timothy Yang, Galen Andrew, Hubert Eichner, Haicheng Sun, Wei Li, Nicholas Kong, Daniel Ramage, Françoise Beaufays
Federated learning is a distributed form of machine learning where both the training data and model training are decentralized.
6 code implementations • 8 Nov 2018 • Andrew Hard, Kanishka Rao, Rajiv Mathews, Swaroop Ramaswamy, Françoise Beaufays, Sean Augenstein, Hubert Eichner, Chloé Kiddon, Daniel Ramage
We train a recurrent neural network language model using a distributed, on-device learning framework called federated learning for the purpose of next-word prediction in a virtual keyboard for smartphones.
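The federated averaging (FedAvg) aggregation at the heart of this setup is a weighted average of client model weights, weighted by each client's number of local training examples. A minimal sketch with simulated clients:

```python
# Sketch of the FedAvg server aggregation: combine client model weights
# in an average weighted by each client's local example count.
# The client weights and counts below are simulated for illustration.

import numpy as np

def fedavg(client_weights, client_num_examples):
    total = sum(client_num_examples)
    return sum(w * (n / total)
               for w, n in zip(client_weights, client_num_examples))

# Three simulated clients with different amounts of local keyboard data.
weights = [np.array([1.0, 2.0]), np.array([3.0, 0.0]), np.array([2.0, 2.0])]
counts = [10, 30, 60]
global_weights = fedavg(weights, counts)
# 0.1*[1,2] + 0.3*[3,0] + 0.6*[2,2] = [2.2, 1.4]
```

Weighting by example count means clients with more local data pull the global model further, while the raw typing data never leaves the device.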
no code implementations • 13 Apr 2017 • Tom Ouyang, David Rybach, Françoise Beaufays, Michael Riley
We describe the general framework of what we call, for short, the keyboard "FST decoder," as well as the implementation details that are new relative to a speech FST decoder.
no code implementations • 24 Jul 2015 • Haşim Sak, Andrew Senior, Kanishka Rao, Françoise Beaufays
We have recently shown that deep Long Short-Term Memory (LSTM) recurrent neural networks (RNNs) outperform feed-forward deep neural networks (DNNs) as acoustic models for speech recognition.
1 code implementation • 5 Feb 2014 • Haşim Sak, Andrew Senior, Françoise Beaufays
However, in contrast to the deep neural networks, the use of RNNs in speech recognition has been limited to phone recognition in small scale tasks.