no code implementations • 5 Feb 2021 • Ruizhi Li, Gregory Sell, Hynek Hermansky
Performance degradation of an Automatic Speech Recognition (ASR) system is commonly observed when the test acoustic condition differs from the training condition.
Automatic Speech Recognition (ASR) +2
no code implementations • 23 Oct 2019 • Ruizhi Li, Gregory Sell, Xiaofei Wang, Shinji Watanabe, Hynek Hermansky
The multi-stream paradigm of audio processing, in which several sources are simultaneously considered, has been an active research area for information fusion.
Automatic Speech Recognition (ASR) +1
no code implementations • 17 Jun 2019 • Ruizhi Li, Xiaofei Wang, Sri Harish Mallidi, Shinji Watanabe, Takaaki Hori, Hynek Hermansky
Two representative frameworks have been proposed and discussed: the Multi-Encoder Multi-Resolution (MEM-Res) framework and the Multi-Encoder Multi-Array (MEM-Array) framework.
Automatic Speech Recognition (ASR) +1
no code implementations • 9 Apr 2019 • Ruizhi Li, Gregory Sell, Hynek Hermansky
Measuring performance of an automatic speech recognition (ASR) system without ground-truth could be beneficial in many scenarios, especially with data from unseen domains, where performance can be highly inconsistent.
Automatic Speech Recognition (ASR) +1
no code implementations • 8 Apr 2019 • Xiaofei Wang, Jinyi Yang, Ruizhi Li, Samik Sadhu, Hynek Hermansky
Quality of data plays an important role in most deep learning tasks.
Automatic Speech Recognition (ASR) +2
no code implementations • 12 Nov 2018 • Ruizhi Li, Xiaofei Wang, Sri Harish Mallidi, Takaaki Hori, Shinji Watanabe, Hynek Hermansky
In this work, we present a novel Multi-Encoder Multi-Resolution (MEMR) framework based on the joint CTC/Attention model.
Automatic Speech Recognition (ASR) +1
no code implementations • 12 Nov 2018 • Xiaofei Wang, Ruizhi Li, Sri Harish Mallidi, Takaaki Hori, Shinji Watanabe, Hynek Hermansky
Automatic Speech Recognition (ASR) using multiple microphone arrays has achieved great success in far-field robustness.
Automatic Speech Recognition (ASR) +1
no code implementations • 4 Oct 2018 • Jaejin Cho, Murali Karthick Baskar, Ruizhi Li, Matthew Wiesner, Sri Harish Mallidi, Nelson Yalta, Martin Karafiat, Shinji Watanabe, Takaaki Hori
In this work, we attempt to use data from 10 BABEL languages to build a multi-lingual seq2seq model as a prior model, and then port it to 4 other BABEL languages using a transfer learning approach.
Language Modelling Sequence-To-Sequence Speech Recognition +2
1 code implementation • 9 Oct 2015 • Kairan Sun, Xu Wei, Gengtao Jia, Risheng Wang, Ruizhi Li
Faced with a continuously increasing scale of data, the original back-propagation neural-network-based machine learning algorithm presents two non-trivial challenges: the huge amount of data makes it difficult to maintain both efficiency and accuracy, and redundant data aggravates the system workload.