no code implementations • 17 May 2022 • Mostafa Karimi, Changliang Liu, Kenichi Kumatani, Yao Qian, Tianyu Wu, Jian Wu
Self-supervised learning (SSL) methods have proven to be very successful in automatic speech recognition (ASR).
no code implementations • 10 Dec 2021 • Kenichi Kumatani, Robert Gmyr, Felipe Cruz Salinas, Linquan Liu, Wei Zuo, Devang Patel, Eric Sun, Yu Shi
The sparsely gated Mixture of Experts (MoE) can increase a network's capacity with little additional computational complexity.
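As a rough illustration of why sparse gating adds capacity at little extra cost, the sketch below routes an input to only the top-k experts and mixes their outputs. It is a minimal toy under stated assumptions, not the paper's implementation: the names `top_k_gate` and `moe_forward`, the use of precomputed gate logits, and the list-based vectors are all hypothetical.

```python
import math


def top_k_gate(logits, k):
    """Return {expert_index: weight} for the k largest gate logits.

    The weights are a softmax taken over the selected logits only,
    so they sum to 1 and every other expert gets weight 0.
    """
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    peak = max(logits[i] for i in top)  # subtract the max for numerical stability
    exps = {i: math.exp(logits[i] - peak) for i in top}
    total = sum(exps.values())
    return {i: e / total for i, e in exps.items()}


def moe_forward(x, experts, gate_logits, k=2):
    """Mix the outputs of the k selected experts; the rest are never called."""
    weights = top_k_gate(gate_logits, k)
    out = None
    for i, w in weights.items():
        y = experts[i](x)  # only k expert evaluations, however many experts exist
        if out is None:
            out = [w * v for v in y]
        else:
            out = [o + w * v for o, v in zip(out, y)]
    return out


# Hypothetical toy experts: each one scales its input by a different factor.
experts = [lambda x, s=s: [s * v for v in x] for s in (1.0, 2.0, 3.0, 4.0)]
mixed = moe_forward([1.0, 2.0], experts, gate_logits=[0.0, 5.0, 5.0, -1.0], k=2)
```

In this toy setup, adding more experts grows the parameter count while the per-input work stays at k expert evaluations.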
no code implementations • 10 Dec 2021 • Kenichi Kumatani, Dimitrios Dimitriadis, Yashesh Gaur, Robert Gmyr, Sefik Emre Eskimez, Jinyu Li, Michael Zeng
For untranscribed speech data, the hypothesis from an ASR system must be used as a label.
no code implementations • 19 Oct 2021 • Tae Jin Park, Kenichi Kumatani, Dimitrios Dimitriadis
Federated learning is a fast-growing area of ML in which the training datasets are highly distributed and change dynamically over time.
no code implementations • 15 Oct 2021 • Rimita Lahiri, Kenichi Kumatani, Eric Sun, Yao Qian
Multilingual end-to-end (E2E) models have shown great potential for expanding language coverage in automatic speech recognition (ASR).
no code implementations • 12 Jul 2021 • Chengyi Wang, Yu Wu, Shujie Liu, Jinyu Li, Yao Qian, Kenichi Kumatani, Furu Wei
Recently, there has been great interest in self-supervised learning (SSL), where the model is pre-trained on large-scale unlabeled data and then fine-tuned on a small labeled dataset.
no code implementations • 14 Jun 2021 • Dimitrios Dimitriadis, Kenichi Kumatani, Robert Gmyr, Yashesh Gaur, Sefik Emre Eskimez
The proposed scheme is based on a weighted gradient aggregation using two-step optimization to offer a flexible training pipeline.
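One way to picture a weighted gradient aggregation followed by a separate server-side update is sketched below. This is only an assumption-laden toy: the listing does not say how the weights are chosen or which optimizers are used, so the function names `aggregate` and `server_step`, the caller-supplied weights, and the plain gradient-descent server update are all hypothetical.

```python
def aggregate(client_grads, weights):
    """Combine per-client gradients into one vector using normalized weights."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    dim = len(client_grads[0])
    agg = [0.0] * dim
    for grad, w in zip(client_grads, weights):
        for j in range(dim):
            agg[j] += (w / total) * grad[j]
    return agg


def server_step(params, agg_grad, lr=0.5):
    """Second step: apply the aggregated gradient on the server (plain SGD here)."""
    return [p - lr * g for p, g in zip(params, agg_grad)]


# Hypothetical example: two clients, the second trusted three times as much.
client_grads = [[1.0, 2.0], [3.0, 4.0]]
agg = aggregate(client_grads, weights=[1.0, 3.0])
new_params = server_step([1.0, 1.0], agg)
```

Keeping aggregation and the server update as separate functions mirrors the flexibility the sentence above points to: either the weighting rule or the server optimizer can be swapped without touching the other.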
3 code implementations • 19 Jan 2021 • Chengyi Wang, Yu Wu, Yao Qian, Kenichi Kumatani, Shujie Liu, Furu Wei, Michael Zeng, Xuedong Huang
In this paper, we propose a unified pre-training approach called UniSpeech to learn speech representations with both unlabeled and labeled data, in which supervised phonetic CTC learning and phonetically-aware contrastive self-supervised learning are conducted in a multi-task learning manner.
no code implementations • 6 Aug 2020 • Dimitrios Dimitriadis, Kenichi Kumatani, Robert Gmyr, Yashesh Gaur, Sefik Emre Eskimez
The target scenario is Acoustic Model training based on this platform.
no code implementations • 6 Feb 2020 • Taejin Park, Kenichi Kumatani, Minhua Wu, Shiva Sundaram
In this paper, we further develop this idea and use a frequency-aligned network for robust multi-channel automatic speech recognition (ASR).
no code implementations • 1 Feb 2020 • Sanna Wager, Aparna Khare, Minhua Wu, Kenichi Kumatani, Shiva Sundaram
Using a large offline teacher model trained on beamformed audio, we trained a simpler multi-channel student acoustic model used in the speech recognition system.
no code implementations • 5 Jan 2019 • Ladislav Mošner, Minhua Wu, Anirudh Raju, Sree Hari Krishnan Parthasarathi, Kenichi Kumatani, Shiva Sundaram, Roland Maas, Björn Hoffmeister
For real-world speech recognition applications, noise robustness is still a challenge.